P2P for Personalized Service Infrastructure
- My thoughts on the direction of P2P research and development
Skype is an obvious example of P2P successfully replacing a traditional communication service. In voice, P2P has broken the traditional telecom service model and lets end users communicate end-to-end at minimal cost: servers can largely be avoided, so the infrastructure costs little yet scales extremely well.
P2P has the potential to bring the same breakthrough to Internet services. This is the direction.
It faces challenges:
- First, end users have no strong incentive to replace centralized web hosting with P2P in order to cut costs, because free hosting is already available on the Internet. Most Internet companies currently earn their revenue from advertising, e.g. Google (4B), from shopping, e.g. Amazon, or from transaction fees, e.g. eBay with its PayPal. Many web services are free: Google, for example, provides Mail, Blog, Group, Search, Froogle, Calendar, Base, Map, etc. to end users at no charge.
- Second, a merchant that wants to provide services on the Internet, e.g. Cisco or Dell, will still rely on centralized server infrastructure for e-commerce for the sake of OAM convenience, i.e. rent a server and the related hosting service, rather than use a P2P network at zero cost, because current P2P infrastructure still needs huge improvements in performance.
But P2P still has huge potential.
Looking back at history, we see that the biggest advantage of P2P is personal service on a global scale. With P2P, a service provided by one individual can reach heights nobody ever imagined.
The best example is BT (BitTorrent). Its killer feature is that a single individual can offer a file-sharing service to the whole world with high capacity and performance. That is totally different from the service provided in a power-law network. It suits services that need high network bandwidth: large file download, video streaming, virtual reality, gaming, etc., which are currently the hottest development areas for P2P. We see this trend both in China and worldwide.
P2P breaks the power-law network, which is what the unstoppable trend toward personalized service requires. Networks naturally tend to form a power-law structure, i.e. services and connections are provided centrally, because that is economical. On the other hand, the trend toward personalization is equally unstoppable. We have seen blogs replace BBSes. Blogs, podcasts, and personal shops on eBay are the hottest topics now.
For Personal Service, the most important tools are P2P and Web2.0.
- P2P is a great tool for breaking the power-law network. P2P gives end users the capability to communicate end-to-end (E2E).
- Web2.0 provides a web service programming environment. End users can call these web services through APIs to provide personalized services, e.g. Google Map API, Amazon API, etc.
Today, most personalized services run on free hosted Internet services, which are convenient but also limit what a personal service can do. Google has done a lot, and we thank Google for it, but there may be a limit. Economically, anyone who wants to deploy a real service must pay for centralized web infrastructure.
The final solution is a distributed web service infrastructure, and P2P plays a key role in it. In this infrastructure, data and services are stored and provided locally, and access is provided over P2P. Most current personal web services, e.g. Froogle, Blog, Google Base, Answer, File Sharing (Img, Video, Book, Music, Podcast, etc.), Streaming, Local, News, Mail, Talk, Group, Homepage, Event, etc., can be provided locally with content management software, which makes controlling and publishing content more convenient. The precondition, of course, is that the content/service provider is always on the network. Non-real-time content/services can also be published and distributed across the network through cooperation among peers, and then accessed with better performance.
What do we need to do to reach this goal?
- First, a content/service publishing tool, e.g. open-source content management software. Its output is SWDL, RSS, tags, etc. The output may include:
  - Service Type: Froogle, Blogger, Answer, File Sharing (Img, Video, Book, Music, Podcast, etc.), Streaming, Local, News, Mail, Talk, Group, Homepage, Event, etc.
  - Service Key: Title, Name, Location, Area, Category, Owner, etc.
- Second, the P2P infrastructure, which manages node join and leave, service publication and withdrawal, cooperation, etc.
- Third, a content/service retrieval tool, e.g. a Google-like engine that finds content/services based on SWDL, RSS, tags, etc. It includes search, categories, directory, earth, and maps.
- Fourth, built-in network infrastructure application support, e.g. PayPal, analytics, AdSense, AdWords, etc.
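To make the Service Type and Service Key output of the publishing tool more concrete, here is a minimal Python sketch of a descriptor record. The class name `ServiceDescriptor`, the `index_terms` method, and the `field:value` term format are all assumptions made for illustration; they are not part of SWDL, RSS, or any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceDescriptor:
    """Hypothetical record a publishing tool could emit for one service."""
    service_type: str                          # e.g. "Blog", "File Sharing"
    keys: dict = field(default_factory=dict)   # Service Key fields (Title, Owner, ...)

    def index_terms(self):
        """Return normalized 'field:value' terms that could be published."""
        terms = {f"type:{self.service_type.strip().lower()}"}
        for name, value in self.keys.items():
            terms.add(f"{name.strip().lower()}:{str(value).strip().lower()}")
        return sorted(terms)


# Example descriptor; all values are made up for illustration.
descriptor = ServiceDescriptor(
    service_type="Blog",
    keys={"Title": "P2P Notes", "Owner": "alice", "Category": "Networking"},
)
terms = descriptor.index_terms()
print(terms)
```

Each term is something the P2P infrastructure could publish and the retrieval tool could later look up.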
The key is the P2P infrastructure, and the core of this infrastructure is cooperation.
Most current P2P research focuses on algorithms for the following aspects:
- Network establishment and maintenance, e.g. node join and leave. The challenge is believed to be network churn.
- Content publication and retrieval. The research is based on key-based routing. The challenge is believed to be performance under global P2P distribution. Publishing resembles a CDN; retrieval tries to exploit proximity.
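As a rough illustration of key-based routing, the sketch below hashes a content key into an ID space and picks the nodes whose IDs are closest to it under an XOR metric, the distance used by Kademlia-style systems. It only shows how responsibility for a key is assigned, not the multi-hop lookup itself. The 32-bit ID space, the node names, and the function names are assumptions chosen to keep the example small.

```python
import hashlib

ID_BITS = 32  # deliberately small for illustration; real systems use much larger IDs


def key_id(text):
    """Map a string (content key or node name) to an integer ID."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (160 - ID_BITS)


def closest_nodes(key, node_ids, count=1):
    """Return the `count` node IDs closest to the key under XOR distance."""
    target = key_id(key)
    return sorted(node_ids, key=lambda node: node ^ target)[:count]


# Hypothetical network of eight nodes.
nodes = [key_id(f"node-{i}") for i in range(8)]

# The two nodes that would store (or be asked for) this content key.
responsible = closest_nodes("blog:p2p-notes", nodes, count=2)
print(responsible)
```

Because every peer computes the same IDs and the same distance, a publisher and a searcher agree on where a key lives without any central directory.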
We need to note:
- Search with multiple keywords is essential. Search by regular expression is important if the system is to be used by end users. Let's assume it can be worked around by preprocessing the expression into multiple keywords; then, at a minimum, multi-keyword search must be supported.
- User-perceived performance is essential. It is the key factor in whether end users will actually use the system. BT is an excellent example, and we can learn a lot from it about improving P2P performance. This is a key research and engineering area. Skype is no accident: it uses super-nodes to improve performance, although this method is not so clever and is not fully distributed. CDN, node caching, and multi-peer cooperation are three other methods for improving performance.
- Incentives for cooperation must be built into the infrastructure. This is the valuable lesson from BT.
- Near-optimal is OK. The CPU and memory capacity of today's terminals, together with today's network bandwidth, make squeezing out the last bit of efficiency a lower priority. This is also why P2P developers now favor Kad so strongly over other algorithms.
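The multi-keyword search requirement in the list above can be sketched as an inverted index: each keyword maps to the set of items published under it, and a query intersects those sets. In a real P2P system each keyword's set would live on the node responsible for that keyword; here a local dictionary stands in for the network, and the class name `KeywordIndex` and the item IDs are hypothetical.

```python
from collections import defaultdict


class KeywordIndex:
    """In-memory stand-in for a distributed keyword index."""

    def __init__(self):
        self._postings = defaultdict(set)  # keyword -> set of item IDs

    def publish(self, item_id, keywords):
        """Register an item under every one of its keywords."""
        for word in keywords:
            self._postings[word.lower()].add(item_id)

    def search(self, keywords):
        """Return the items that match ALL of the given keywords."""
        sets = [self._postings.get(word.lower(), set()) for word in keywords]
        if not sets:
            return set()
        sets.sort(key=len)          # start from the rarest keyword
        result = set(sets[0])
        for postings in sets[1:]:
            result &= postings
        return result


index = KeywordIndex()
index.publish("peer1/video42", ["p2p", "video", "streaming"])
index.publish("peer2/blog7", ["p2p", "blog"])
index.publish("peer3/video9", ["video", "music"])

hits = index.search(["P2P", "video"])
print(hits)
```

Starting the intersection from the smallest set keeps the amount of data exchanged low, which matters when each set has to be fetched from a different peer.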
Some thoughts:
- Security: The security problems will mostly lie at the application level, e.g. spam. Swindlers will not disappear from the virtual world, but spotting them is the end user's task; in fact, end users already have plenty of experience from today's Internet. Transaction security and lower-level security are also no different from today's Internet.
- Network churn: This problem may ease somewhat once everyone has an always-on network link, just as all web servers today must stay online to provide service. I believe always-on is a not-so-distant goal. In China today, at least, it costs about $15/month via ADSL or LAN; the cost of 3G is still unknown. If we focus too much on this problem, the history of Ad Hoc networking will repeat itself. In fact, the most important application of Ad Hoc networks now is establishing a network on site quickly, rather than coping with network churn.
[Reference]
- Paul Harrison mentioned the concept of a "decentralized internet service" and gave some examples. The interesting PDF slides are available here.