What would happen if somebody came up with a peer-to-peer cloud (P2P-Cloud) storage or computing service? I see this as
- Operating a little like Napster/Gnutella, where many people come together and share out their storage/computing resources.
- It could operate in either a centralized or decentralized fashion
- It would allow access to data/computing resources from anywhere on the internet
Everyone joining the P2P-Cloud would need to set aside computing and/or storage resources they were willing to devote to the cloud. By doing so, they would gain access to an equivalent amount (minus overhead) of other nodes' computing and storage resources to use as they see fit.
For cloud storage the P2P-Cloud would create a common cloud data repository spread across all nodes in the network:
- Data would be distributed across the network in a way that allows reconstruction within a reasonable time frame and tolerates a reasonable number of node outages without loss of data.
- Data would be encrypted before being sent to the cloud rendering the data unreadable without the key.
- Data would NOT necessarily be shared, but would be hosted on other users systems.
As such, if I were to offer up 100GB of storage to the P2P-Cloud, I would get at least 100GB (less overhead) of protected storage elsewhere on the cloud to use as I see fit. Some percentage of this would be lost to administration (say 1-3%) and redundancy protection (say ~25%), but the remaining ~72GB of off-site storage could be very useful for DR purposes.
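The arithmetic above is simple enough to sketch; the 1-3% administration and ~25% redundancy rates are my own rough estimates from the paragraph, not measured figures:

```python
# Back-of-the-envelope usable storage after overhead.
def usable_storage(offered_gb: float, admin_pct: float, redundancy_pct: float) -> float:
    """Storage left for the user after administration and redundancy overhead."""
    overhead_gb = offered_gb * (admin_pct + redundancy_pct) / 100
    return offered_gb - overhead_gb

print(usable_storage(100, admin_pct=3, redundancy_pct=25))  # 72.0 GB for off-site DR
```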
P2P-Cloud storage would provide a reliable, secure, distributed file repository that would be easily accessible from any internet location. At a minimum, the service would be free and equivalent to what someone supplies (less overhead) to the P2P-Cloud Storage service. If storage needs exceeded your commitment, more cloud storage could be provided at a modest cost to the consumer. Such fees would be shared by all the participants offering excess [= offered − (consumed + overhead)] storage to the cloud.
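One plausible way to settle those fees is a pro-rata split over each participant's excess, using the bracketed formula above. The function and field names here are hypothetical, purely for illustration:

```python
# Hypothetical fee split among peers offering excess storage,
# where excess = offered - (consumed + overhead).
def excess(offered: float, consumed: float, overhead: float) -> float:
    return max(0.0, offered - (consumed + overhead))

def share_fees(total_fees: float, peers: list[dict]) -> dict[str, float]:
    """Split total_fees in proportion to each peer's excess storage."""
    excesses = {p["name"]: excess(p["offered"], p["consumed"], p["overhead"])
                for p in peers}
    pool = sum(excesses.values())
    return {name: total_fees * e / pool for name, e in excesses.items()} if pool else {}

peers = [
    {"name": "alice", "offered": 100, "consumed": 40,  "overhead": 28},  # excess 32
    {"name": "bob",   "offered": 200, "consumed": 100, "overhead": 36},  # excess 64
]
print(share_fees(9.0, peers))  # alice earns 3.0, bob earns 6.0
```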
- P2P-Cloud computing suppliers would agree to use something like a “new screensaver” which would perform computation while generating a viable screensaver.
- Whenever the screensaver was invoked, it would start execution on the last assigned processing unit. Intermediate work results would need to be saved and when completed, the answer could be sent to the requester and a new processing unit assigned.
- Processing units would be assigned by the P2P-Cloud computing consumer and would be timeout-able and re-assignable at will.
Computing users won’t gain much if the computing time they consume is <= the computing time they offer (less overhead). However, the time offset may be worth something, i.e., computing time now might be more valuable than computing time tonight, which may offer a slight margin of value to help get this off the ground. As such, P2P-Cloud computing suppliers would need to be able to specify when their computing resources would mostly be available, along with their type, quality, and quantity.
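A supplier's advertisement of type, quality, quantity, and availability might look something like the record below. Every field name here is hypothetical, just to show the kind of information a supplier would publish:

```python
# Hypothetical supplier resource advertisement for the P2P-Cloud.
from dataclasses import dataclass

@dataclass
class ResourceOffer:
    node_id: str
    cpu_cores: int          # quantity
    benchmark_score: float  # quality, e.g. relative to a reference machine
    resource_type: str      # "cpu", "gpu", "storage", ...
    available_from: int     # hour of day when the machine is usually idle
    available_until: int

    def available_at(self, hour: int) -> bool:
        """True if the offer covers this hour; the window may wrap midnight."""
        if self.available_from <= self.available_until:
            return self.available_from <= hour < self.available_until
        return hour >= self.available_from or hour < self.available_until

# A machine offered overnight, 22:00 to 06:00:
offer = ResourceOffer("node42", cpu_cores=4, benchmark_score=1.2,
                      resource_type="cpu", available_from=22, available_until=6)
assert offer.available_at(23) and offer.available_at(3)
assert not offer.available_at(12)
```

A consumer wanting cycles "now" would filter offers by `available_at` on the current hour, which is where the time-offset value mentioned above would come in.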
It is unclear how to secure the processing unit, and this makes legal issues more prevalent. That may not be much of a problem, as a complex distributed computing task makes little sense in isolation. But the (il)legality of some data processing activities could conceivably put the provider in a precarious position. (Somebody from the legal profession would need to clarify all this, but I would think that some Amazon EC2-like licensing might offer safe harbor here.)
P2P-Cloud computing services wouldn’t necessarily be amenable to the more normal, non-distributed or linear computing tasks, but one could view these as just a primitive version of distributed computing tasks. In either case, any data needed for computation would need to be sent along with the computing software to be run on a distributed node. Whether it’s worth the effort is something for the users to debate.
BOINC can provide a useful model here. Also, the Condor(R) project at U. of Wisconsin/Madison can provide a similar framework for scheduling the work of a “less distributed” computing task model. In my mind, both types of services ultimately need to be provided.
To attract more compute servers, SETI@home and similar BOINC projects rely on the appeal of doing good deeds. As such, if you can make your computing task do something of value to most users, then maybe that’s enough; in that case, I would suggest joining up as a BOINC project. For the rest of us, doing more mundane data processing, just offering our compute services to the P2P-Cloud will need to suffice.
Starting up the P2P-Cloud
Bootstrapping the P2P-Cloud might take some effort, but once going it should be self-sustaining (assuming no centralized infrastructure). I envision an open source solution, taking off from the work done on Napster & Gnutella and/or BOINC & Condor.
I believe the P2P-Cloud Storage service would be the easiest to get started. BOINC and SETI@home (list of active BOINC projects) have been around a lot longer than cloud storage, but their existence suggests that with the right incentives, even the P2P-Cloud Computing service can make sense.