Driven by the rapid growth of content traffic and the demand for service quality, Internet content providers (CPs) have started to bypass transit providers and connect with access providers directly via private peering agreements. This peering relationship often raises disputes, e.g., Netflix vs. Comcast, and is not well understood. In this paper, we build a peering contract model and propose the concept of contractual equilibrium, based on which we study the formation and evolution of peering contracts. By using market data, we emulate the strategic peering behavior of providers and shed light on the understanding of private peering agreements. We reveal that the superiority and market dominance of providers primarily determine their peering strategies. We show that 1) superior providers tend to engage in peering more aggressively, and 2) non-dominant CPs' optimal peering strategies are negatively correlated due to market cannibalism, while the dominant CP often behaves oppositely. Our findings help explain phenomena such as why Netflix and Comcast signed the first peering contract, and reason whether private peering contracts will strengthen in future.

Zero queueing delay is highly desirable in large-scale computing systems. Existing work has shown that it can be asymptotically achieved by using the celebrated Power-of-d-choices (pod) policy with a probe overhead $d = ømegałeft(\fracłog N 1-łambda \right)$, and it is impossible when $d = Ołeft(\frac1 1-łambda \right)$, where N is the number of servers and $łambda$ is the load of the system. However, these results are based on the model where each job is an indivisible unit, which does not capture the parallel structure of jobs in today's predominant parallel computing paradigm. This paper thus considers a model where each job consists of a batch of parallel tasks. Under this model, we propose a new notion of zero (asymptotic) queueing delay that requires the job delay under a policy to approach the job delay given by the max of its tasks' service times, i.e., the job delay assuming its tasks entered service right upon arrival. This notion quantifies the effect of queueing on a job level for jobs consisting of multiple tasks, and thus deviates from the conventional zero queueing delay for single-task jobs in the literature. We show that zero queueing delay for parallel jobs can be achieved using the batch-filling policy (a variant of the celebrated \pod\ policy) with a probe overhead $d = ømegałeft(\frac1 (1-łambda)łog k \right)$ in the sub-Halfin-Whitt heavy-traffic regime, where k is the number of tasks in each job and k properly scales with N (the number of servers). This result demonstrates that for parallel jobs, zero queueing delay can be achieved with a smaller probe overhead. We also establish an impossibility result: we show that zero queueing delay cannot be achieved if $d = \expłeft(ołeft(\fracłog N łog k \right) \right)$. Simulation results are provided to demonstrate the consistency between numerical results and theoretical results under reasonable settings, and to investigate gaps in the theoretical analysis.

The Gittins scheduling policy minimizes the mean response in the single-server M/G/1 queue in a wide variety of settings. Most famously, Gittins is optimal when preemption is allowed and service requirements are unknown but drawn from a known distribution. Gittins is also optimal much more generally, adapting to any amount of available information and any preemption restrictions. However, scheduling to minimize mean response time in a multiserver setting, specifically the central-queue M/G/k, is a much more difficult problem. In this work we give the first general analysis of Gittins in the M/G/k. Specifically, we show that under extremely general conditions, Gittins's mean response time in the M/G/k is at most its mean response time in the M/G/1 plus an $O(łog(1/(1 - ρ)))$ additive term, where ρ is the system load. A consequence of this result is that Gittins is heavy-traffic optimal in the M/G/k if the service requirement distribution S satisfies $\mathbfE [S^2(łog S)^+] < \infty$. This is the most general result on minimizing mean response time in the M/G/k to date. To prove our results, we combine properties of the Gittins policy and Palm calculus in a novel way. Notably, our technique overcomes the limitations of tagged job methods that were used in prior scheduling analyses.

The intrinsic hardware imperfection of WiFi chipsets manifests itself in the transmitted signal, leading to a unique radiometric fingerprint. This fingerprint can be used as an additional means of authentication to enhance security. In fact, recent works propose practical fingerprinting solutions that can be readily implemented in commercial-off-the-shelf devices. In this paper, we prove analytically and experimentally that these solutions are highly vulnerable to impersonation attacks. We also demonstrate that such a unique device-based signature can be abused to violate privacy by tracking the user device, and, as of today, users do not have any means to prevent such privacy attacks other than turning off the device. We propose RF-Veil, a radiometric fingerprinting solution that not only is robust against impersonation attacks but also protects user privacy by obfuscating the radiometric fingerprint of the transmitter for non-legitimate receivers. Specifically, we introduce a randomized pattern of phase errors to the transmitted signal such that only the intended receiver can extract the original fingerprint of the transmitter. In a series of experiments and analyses, we expose the vulnerability of adopting naive randomization to statistical attacks and introduce countermeasures. Finally, we show the efficacy of RF-Veil experimentally in protecting user privacy and enhancing security. More importantly, our proposed solution allows communicating with other devices, which do not employ RF-Veil.

Applications in cloud platforms motivate the study of efficient load balancing under job-server constraints and server heterogeneity. In this paper, we study load balancing on a bipartite graph where left nodes correspond to job types and right nodes correspond to servers, with each edge indicating that a job type can be served by a server. Thus edges represent locality constraints, i.e., an arbitrary job can only be served at servers which contain certain data and/or machine learning (ML) models. Servers in this system can have heterogeneous service rates. In this setting, we investigate the performance of two policies named Join-the-Fastest-of-the-Shortest-Queue (JFSQ) and Join-the-Fastest-of-the-Idle-Queue (JFIQ), which are simple variants of Join-the-Shortest-Queue and Join-the-Idle-Queue, where ties are broken in favor of the fastest servers. Under a "well-connected'' graph condition, we show that JFSQ and JFIQ are asymptotically optimal in the mean response time when the number of servers goes to infinity. In addition to asymptotic optimality, we also obtain upper bounds on the mean response time for finite-size systems. We further show that the well-connectedness condition can be satisfied by a random bipartite graph construction with relatively sparse connectivity.

We consider a system with N~parallel servers where incoming jobs are immediately replicated to, say, d~servers. Each of the N servers has its own queue and follows a FCFS discipline. As soon as the first job replica is completed, the remaining replicas are abandoned. We investigate the achievable stability region for a quite general workload model with different job types and heterogeneous servers, reflecting job-server affinity relations which may arise from data locality issues and soft compatibility constraints. Under the assumption that job types are known beforehand we show for New-Better-than-Used (NBU) distributed speed variations that no replication $(d=1)$ gives a strictly larger stability region than replication $(d>1)$. Strikingly, this does not depend on the underlying distribution of the intrinsic job sizes, but observing the job types is essential for this statement to hold. In case of non-observable job types we show that for New-Worse-than-Used (NWU) distributed speed variations full replication ($d=N$) gives a larger stability region than no replication $(d=1)$.

We consider the question of identifying which set of products are purchased and at what prices in a given transaction by observing only the total amount spent in the transaction, and nothing more. The ability to solve such an inverse problem can lead to refined information about consumer spending by simply observing anonymized credit card transactions data. Indeed, when considered in isolation, it is impossible to identify the products purchased and their prices from a given transaction just based on the transaction total. However, given a large number of transactions, there may be a hope. As the main contribution of this work, we provide a robust estimation algorithm for decomposing transaction totals into the underlying, individual product(s) purchased by utilizing a large corpus of transactions. Our method recovers a (product prices) vector $p \in \mathbbR ^N_>0 $ of unknown dimension (number of products) N as well as matrix $A \in \mathbbZ ^M \times N _\geq0 $ simply from M observations (transaction totals) $y \in \mathbbR ^M_>0 $ such that $y = A p + η$ with η \in \mathbbR ^M$ representing noise (taxes, discounts, etc.). We formally establish that our algorithm identifies $N, A$ precisely and p approximately, as long as each product is purchased individually at least once, i.e. $M \geq N$ and A has rank N. Computationally, the algorithm runs in polynomial time (with respect to problem parameters), and thus we provide a computationally efficient and statistically robust method for solving such inverse problems. We apply the algorithm to a large corpus of anonymized consumer credit card transactions in the period 2016-2019, with data obtained from a commercial data vendor. The transactions are associated with spending at Apple, Chipotle, Netflix, and Spotify. From just transactions data, our algorithm identifies (i) key price points (without access to the listed prices), (ii) products purchased within a transaction, (iii) product launches, and (iv) evidence of a new 'secret' product from Netflix - rumored to be in limited release.

We analyze the performance of redundancy in a multi-type job and multi-type server system. We assume the job dispatcher is unaware of the servers' capacities, and we set out to study under which circumstances redundancy improves the performance. With redundancy an arriving job dispatches redundant copies to all its compatible servers, and departs as soon as one of its copies completes service. As a benchmark comparison, we take the non-redundant system in which a job arrival is routed to only one randomly selected compatible server. Service times are generally distributed and all copies of a job are identical, i.e., have the same service requirement. In our first main result, we characterize the sufficient and necessary stability conditions of the redundancy system. This condition coincides with that of a system where each job type only dispatches copies into its least-loaded servers, and those copies need to be fully served. In our second result, we compare the stability regions of the system under redundancy to that of no redundancy. We show that if the server's capacities are sufficiently heterogeneous, the stability region under redundancy can be much larger than that without redundancy. We apply the general solution to particular classes of systems, including redundancy-d and nested models, to derive simple conditions on the degree of heterogeneity required for redundancy to improve the stability. As such, our result is the first in showing that redundancy can improve the stability and hence performance of a system when copies are non-i.i.d..

High scalability and low running costs have made fuzz testing the de facto standard for discovering software bugs. Fuzzing techniques are constantly being improved in a race to build the ultimate bug-finding tool. However, while fuzzing excels at finding bugs in the wild, evaluating and comparing fuzzer performance is challenging due to the lack of metrics and benchmarks. For example, crash count---perhaps the most commonly-used performance metric---is inaccurate due to imperfections in deduplication techniques. Additionally, the lack of a unified set of targets results in ad hoc evaluations that hinder fair comparison. We tackle these problems by developing Magma, a ground-truth fuzzing benchmark that enables uniform fuzzer evaluation and comparison. By introducing real bugs into real software, Magma allows for the realistic evaluation of fuzzers against a broad set of targets. By instrumenting these bugs, Magma also enables the collection of bug-centric performance metrics independent of the fuzzer. Magma is an open benchmark consisting of seven targets that perform a variety of input manipulations and complex computations, presenting a challenge to state-of-the-art fuzzers. We evaluate seven widely-used mutation-based fuzzers (AFL, AFLFast, AFL++, FairFuzz, MOpt-AFL, honggfuzz, and SymCC-AFL) against Magma over 200,000 CPU-hours. Based on the number of bugs reached, triggered, and detected, we draw conclusions about the fuzzers' exploration and detection capabilities. This provides insight into fuzzer performance evaluation, highlighting the importance of ground truth in performing more accurate and meaningful evaluations.

The production of counterfeit money has a long history. It refers to the creation of imitation currency that is produced without the legal sanction of government. With the growth of the cryptocurrency ecosystem, there is expanding evidence that counterfeit cryptocurrency has also appeared. In this paper, we empirically explore the presence of counterfeit cryptocurrencies on Ethereum and measure their impact. By analyzing over 190K ERC-20 tokens (or cryptocurrencies) on Ethereum, we have identified $2,117$ counterfeit tokens that target 94 of the 100 most popular cryptocurrencies. We perform an end-to-end characterization of the counterfeit token ecosystem, including their popularity, creators and holders, fraudulent behaviors and advertising channels. Through this, we have identified two types of scams related to counterfeit tokens and devised techniques to identify such scams. We observe that over 7,104 victims were deceived in these scams, and the overall financial loss sums to a minimum of \$ 17 million (74,271.7 ETH). Our findings demonstrate the urgency to identify counterfeit cryptocurrencies and mitigate this threat.

We introduce and study a general version of the fractional online knapsack problem with multiple knapsacks, heterogeneous constraints on which items can be assigned to which knapsack, and rate-limiting constraints on the assignment of items to knapsacks. This problem generalizes variations of the knapsack problem and of the one-way trading problem that have previously been treated separately, and additionally finds application to the real-time control of electric vehicle (EV) charging. We introduce a new algorithm that achieves a competitive ratio within an additive factor of one of the best achievable competitive ratios for the general problem and matches or improves upon the best-known competitive ratio for special cases in the knapsack and one-way trading literatures. Moreover, our analysis provides a novel approach to online algorithm design based on an instance-dependent primal-dual analysis that connects the identification of worst-case instances to the design of algorithms. Finally, we illustrate the proposed algorithm via trace-based experiments of EV charging.

This paper considers online convex optimization (OCO) problems where decisions are constrained by available energy resources. A key scenario is optimal power control for an energy harvesting device with a finite capacity battery. The goal is to minimize a time-average loss function while keeping the used energy less than what is available. In this setup, the distribution of the randomly arriving harvestable energy (which is assumed to be i.i.d.) is unknown, the current loss function is unknown, and the controller is only informed by the history of past observations. A prior algorithm is known to achieve $O(\sqrtT )$ regret by using a battery with an $O(\sqrtT )$ capacity. This paper develops a new algorithm that maintains this asymptotic trade-off with the number of time steps T while improving dependency on the dimension of the decision vector from $O(\sqrtn )$ to $O(\sqrtłog(n) )$. The proposed algorithm introduces a separation of the decision vector into amplitude and direction components. It uses two distinct types of Bregman divergence, together with energy queue information, to make decisions for each component.