George
3 years ago
in Multicast and Network Neutrality on The Technology Liberation Front
And thanks Tim for being such a reasonable, open-minded person, and for putting truth ahead of your own desires to be right. It is refreshing and inspiring, and makes participating at TLF a lot of fun (even when it's over a rather minor technical issue that is largely irrelevant to the main discussion :) ).
3 years ago
in Multicast and Network Neutrality on The Technology Liberation Front
Tim, re:costs of caching.
I'm not against ISP caching. Certainly, for many types of traffic, it helps. But I maintain that caching data at the edge of the network a priori, as a scheme for distributing content with the goal of reducing costs, is a much less cost-effective solution than using adaptive p2p networks. If you aren't convinced of this by a simple cost analysis of what you have to pay to get Akamai (or one of its competitors) to distribute your content vs. what you have to pay to use BitTorrent, RedSwoosh, or one of their competitors, I'm not sure that anything I say will convince you. But I thought it might be useful to describe the technical reasons why the cost of centralized, hierarchical network distribution channels will always be much higher than that of democratic, on-demand, cooperative swarming distribution.
One of these technicalities is cache size and cache replacement policy. As long as the only thing we are talking about is episodes of Desperate Housewives, it is easy for ISPs or Akamai to provision enough storage at the nodes on the edge of the network to cache this content. But we know that users will want more than just Desperate Housewives. In fact, we know that traffic in today's p2p filesharing networks follows a long-tail distribution, and that the total number of bytes consumed by the long tail dwarfs that of the blockbuster spike. So you can set up a cache of finite size at the edge, and you can cache the most popular content, but you can't possibly hope to have enough storage to hold the majority of the traffic that will be generated by the long tail.
So if you are an ISP, what do you do for all that long-tail content? You hope that your users are using something like BitTorrent, which will prefer to download from peers that are near because those peers will, by their very nature, be able to provide the best service. And, lucky for you, the peers that are closest will be on your own network, so you won't be incurring costs going out to the backbone (just as if you had a cache on your own network).
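To make the "prefer whoever serves you best" idea concrete, here is a minimal sketch. It is not BitTorrent's actual peer-selection logic, which is more elaborate; the `Peer` class, the `pick_peers` helper, and all of the throughput numbers are made up for illustration. The only point is that ranking by observed throughput tends to favor peers on the local network without the client knowing anything about topology.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    on_local_network: bool   # used only to inspect the result, never for ranking
    measured_kbps: float     # throughput observed recently (hypothetical numbers)


def pick_peers(peers, slots):
    """Return the `slots` peers with the highest observed throughput."""
    return sorted(peers, key=lambda p: p.measured_kbps, reverse=True)[:slots]


# Hypothetical swarm: two peers on the same ISP, two across the backbone.
peers = [
    Peer("remote-1", False, 300.0),
    Peer("neighbor-1", True, 900.0),
    Peer("remote-2", False, 120.0),
    Peer("neighbor-2", True, 750.0),
]
chosen = pick_peers(peers, slots=2)
```

With these assumed numbers the two local peers win both slots, even though `on_local_network` never enters the ranking.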
Yes, if no one on your local network is sharing a copy of a particular file, your users will have to go out to the backbone (at least once). But that's the absolute best you can do. Period. Even if you have a cache, if none of your users have accessed that particular file recently, it will get expunged from the cache soon anyway (because no matter how big, your cache is finite, and because the long tail causes a lot of expunging).
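The cache-size argument can be explored with a rough simulation. The sketch below assumes a Zipf-like popularity curve and an LRU replacement policy; both are my assumptions for illustration, as are the function name and parameters, and the hit rates you get depend heavily on the exponent and catalog size you pick.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def lru_hit_rate(catalog_size, cache_slots, requests, zipf_exponent=1.0, seed=0):
    """Simulate an LRU cache serving Zipf-distributed requests; return the hit rate."""
    rng = random.Random(seed)
    # Assumed popularity model: the item at rank r is requested
    # with probability proportional to 1 / r**zipf_exponent.
    weights = [1.0 / (rank ** zipf_exponent) for rank in range(1, catalog_size + 1)]
    cum_weights = list(accumulate(weights))
    stream = rng.choices(range(catalog_size), cum_weights=cum_weights, k=requests)

    cache = OrderedDict()
    hits = 0
    for item in stream:
        if item in cache:
            hits += 1
            cache.move_to_end(item)        # mark as most recently used
        else:
            cache[item] = True
            if len(cache) > cache_slots:
                cache.popitem(last=False)  # evict the least recently used item
    return hits / requests


# Same request stream (same seed), two cache sizes.
small_cache = lru_hit_rate(catalog_size=10_000, cache_slots=100, requests=20_000)
large_cache = lru_hit_rate(catalog_size=10_000, cache_slots=1_000, requests=20_000)
```

Growing `catalog_size` while holding `cache_slots` fixed is the experiment that matters here: every request that misses the cache is one that has to go out to the backbone.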
How about this: pretend your ISP cache is a distributed cache. Now, think of the nodes of that distributed cache residing not on your own servers in your data center, but out on your customers' PCs. It's still a local cache. Only it's bigger, smarter, cheaper, and more efficient than your centralized ISP cache could ever be. And that is why, sooner or later, the 'push data to edge node caches' model is doomed.
I hope I'm not derailing the discussion, and I'm not sure if any of this affects your core point about net neutrality, but I really feel strongly about p2p as an incredibly robust architecture (the fact that it is most commonly associated with pirating media is a shame).
3 years ago
in Multicast and Network Neutrality on The Technology Liberation Front
Tim, you've got some erroneous perceptions of p2p, multicast, and Akamai here that I'd like to address.
#1. multicast theoretically only works if all 'consumers' of the stream are consuming at roughly the same time (and even then, the technical hurdles are significant). This is because multicast doesn't cache the bits for later use. It isn't a good solution for users who want to view content on-demand.
#2. Akamai's Edgecast solution does indeed address the shortcomings of multicast, by caching data at each node. The inner workings of distribution among these nodes are not publicly known in detail, but your assumptions are probably not far off.
#3. Here's the big technical error: you claim that a p2p model is less efficient than Akamai. You are wrong for several reasons:
a) Akamai has to physically place nodes, and the distribution of those nodes must match demand. This is problematic. Suppose the content provider assumes that 10 million users in the US will download a copy of their content, and that these users will be more or less uniformly distributed throughout the country. But then, to their surprise, 8 million people in Arkansas try to download the show, and only 1 million in the rest of the country. The Arkansas residents will be starved for capacity (resulting in millions of failed downloads), while the rest of the country will be overserved with capacity. With a p2p distribution system, each consumer becomes a provider, and capacity automatically moves to where it is needed. This is commonly referred to as 'swarming', and is what happens with products like BitTorrent and RedSwoosh. Because there is a 1-to-1 relationship between consuming and providing content, capacity is only limited by total network capacity -- you can't beat that for performance!
b) You claim that ISPs (and Akamai) have better knowledge of physical topologies, and can thus better place caching nodes in the topology. Yet physical topology is irrelevant. This is counterintuitive, but the only metric that matters in networking is current network performance. Physical proximity and channel capacity don't matter, because performance (bps) is variable. Even if nodeA is physically closer to me and the connection to it, pipeA, has higher capacity, it does me no good to prefer it over the farther-away nodeB and lower-capacity pipeB if pipeA is currently congested and pipeB is not.
c) Cost and efficiencies. This derives a bit from a) and b), but the reason Akamai will always be more expensive to run than BitTorrent (even if all bandwidth were 100% free) is that you have to have an army of Akamai installations and support staff to service those 18,000 nodes, and the staff has to be distributed around the globe. Even if your support army works for free, you still have to pay for electricity, physical facilities, and bandwidth. Spread your content with BitTorrent instead, and you can get rid of all that staff and all those physical installations.
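The Arkansas example in a) can be worked through with a few lines of arithmetic. This is an idealized sketch under stated assumptions: I collapse the country into two regions, assume the uniform forecast gives Arkansas 1/50 of the pre-placed capacity, and take the 1-to-1 upload/download relationship at face value (ignoring asymmetric links and peers that leave early). The function names are mine.

```python
def served_by_fixed_nodes(provisioned, demand):
    """Downloads served when each region can only use its own pre-placed capacity."""
    return {region: min(demand[region], provisioned[region]) for region in demand}


def served_by_swarm(demand, uploads_per_download=1.0):
    """Downloads served when every downloader also uploads (idealized swarming)."""
    return {
        region: min(count, int(count * uploads_per_download))
        for region, count in demand.items()
    }


# Forecast: 10 million downloads spread uniformly, so (by assumption)
# Arkansas is provisioned for 1/50 of the total.
provisioned = {"Arkansas": 200_000, "rest of US": 9_800_000}
# Actual demand, taken from the example in a).
demand = {"Arkansas": 8_000_000, "rest of US": 1_000_000}

fixed = served_by_fixed_nodes(provisioned, demand)
swarm = served_by_swarm(demand)

unserved_fixed = sum(demand.values()) - sum(fixed.values())   # failed downloads
idle_fixed = sum(provisioned.values()) - sum(fixed.values())  # capacity paid for but unused
unserved_swarm = sum(demand.values()) - sum(swarm.values())
```

Under these assumptions the fixed placement fails 7.8 million downloads while 8.8 million downloads' worth of capacity sits idle elsewhere; in the idealized swarm, capacity follows demand and nothing goes unserved.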
In short, I'd sum up centralized, hierarchical distribution strategies as analogous to centrally planned, hierarchical economies. The Akamai design is the design of a network communist, centrally planned by committee, politburo, or dictator -- and no matter how well it is planned, it can never respond to changing market forces as well as a free market can. (BTW, that isn't a criticism of Akamai as a capitalist business or even as an engineering design -- Akamai continues to pull in wonderful revenues, and at the time it was designed, it was revolutionary. It's just that Adam Smith hadn't come along yet...)