## A new way to compute

I read an article the other day on using using random pulses rather than digital numbers to compute with, see Computing with random pulses promises to simplify circuitry and save power, in IEEE Spectrum. Essentially they encode a number as a probability in a random string of bits and then use simple logic to compute with. This approach was invented in the early days of digital logic and was called stochastic computing.

# Stochastic numbers?

It’s pretty easy to understand how such logic can work for fractions. For example to represent 1/4, you would construct a bit stream that had one out of every four bits, on average, as a 1 and the rest 0’s. This could easily be a random string of bits which have an average of 1 out of every 4 bits as a one.

A nice result of such a numerical representation is that it easily results in more precision as you increase the length of the bit stream. The paper calls this progressive precision.

Progressive precision helps stochastic computing be more fault tolerant than standard digital logic. That is, if the string has one bit changed it’s not going to make that much of a difference from the original string and computing with an erroneous number like this will probably result in similar results to the correct number.  To have anything like this in digital computation requires parity bits, ECC, CRC and other error correction mechanisms and the logic required to implement these is extensive.

# Stochastic computing

Another advantage of stochastic computation and using a probability  rather than binary (or decimal) digital representation, is that most arithmetic functions are much simpler to implement.

They discuss two examples in the original paper:

• Multiplication – Multiplying two probabilistic bit streams together is as simple as ANDing the two strings.

• Addition – Adding two probabilistic bit strings together just requires a multiplexer, but you end up with a bit string that is the sum of the two divided by two.

I see a couple of problems with stochastic computing:,

• How do you represent  an irrational number, such as the square root of 2;
• How do you represent integers or for that matter any value greater than 1.0 in a probabilistic bit stream; and
• How do you represent negative values in a bit stream.

I suppose irrational numbers could be represented by taking a near-by, close approximation of the irrational number. For instance, using 1.4 for the square root of two, or 1.41, or 1.414, …. And this way you could get whatever (progressive) precision that was needed.

As for integers greater than 1.0, perhaps they could use a floating point representation, with two defined bit strings, one representing the mantissa (fractional part) and the other an exponent. We would assume that the exponent rather than being a probability from 0..1.0, would be inverted and represent 1.0…∞.

Negative numbers are a different problem. One way to supply negative numbers is to use something akin to complemetary representation. For example, rather than the probabilistic bit stream representing 0.0 to 1.0 have it represent -0.5 to 0.5. Then progressive precision would work for negative numbers as well a positive numbers.

One major downside to stochastic numbers and computation is that high precision arithmetic is very difficult to achieve.  To perform 32 bit precision arithmetic would require a bit streams that were  2³² bits long. 64 bit precision would require streams that were  2**64th bits long.

# Good uses for stochastic computing

One advantage of simplified logic used in stochastic computing is it needs a lot less power to compute. One example in the paper they use for stochastic computers is as a retinal sensor for in the body visual augmentation. They developed a neural net that did edge detection that used a stochastic front end to simplify the logic and cut down on power requirements.

Other areas where stochastic computing might help is for IoT applications. There’s been a lot of interest in IoT sensors being embedded in streets, parking lots, buildings, bridges, trucks, cars etc. Most have a need to perform a modest amount of edge computing and then send information up to the cloud or some edge consolidator intermediate

Many of these embedded devices lack access to power, so they will need to make do with whatever they can find.  One approach is to siphon power from ambient radio (see this  Electricity harvesting… article), temperature differences (see this MIT … power from daily temperature swings article), footsteps (see Pavegen) or other mechanisms.

The other use for stochastic computing is to mimic the brain. It appears that the brain encodes information in pulses of electric potential. Computation in the brain happens across exhibitory and inhibitory circuits that all seem to interact together.  Stochastic computing might be an effective way, low power way to simulate the brain at a much finer granularity than what’s available today using standard digital computation.

~~~~

Not sure it’s all there yet, but there’s definitely some advantages to stochastic computing. I could see it being especially useful for in body sensors and many IoT devices.

Photo Credit(s):  The logic of random pulses

2 bit by 2 bit multiplier, By Sodaboy1138 (talk) (Uploads) – Own work, CC BY-SA 3.0, wikimedia

AND ANSI Labelled, By Inductiveload – Own work, Public Domain, wikimedia

2 Input multiplexor

A battery free implantable neural sensor, MIT Technology Review article

Integrating neural signal and embedded system for controlling a small motor, an IntechOpen article

## Blockchain, open source and trusted data lead to better SDG impacts

Read an article today in Bitcoin magazine IXO Foundation: A blockchain based response to UN call for [better] data which discusses how the UN can use blockchains to improve their development projects.

The UN introduced the 17 Global Goals for Sustainable Development (SDG) to be achieved in the world by 2030. The previous 8 Millennial Development Goals (MDG) expire this year.

Although significant progress has been made on the MDGs, one ongoing determent to  MDG attainment has been that progress has been very uneven, “with the poorest and economically disadvantaged often bypassed”.  (See WEF, What are Sustainable Development Goals).

Throughout the UN 17 SDG, the underlying objective is to end global poverty  in a sustainable way.

## Impact claims

In the past organizations performing services for the UN under the MDG mandate, indicated they were performing work toward the goals by stating, for example, that they planted 1K acres of trees, taught 2K underage children or distributed 20 tons of food aid.

The problem with such organizational claims is they were left mostly unverified. So the UN, NGOs and other charities funding these projects were dependent on trusting the delivering organization to tell the truth about what they were doing on the ground.

However, impact claims such as these can be independently validated and by doing so the UN and other funding agencies can determine if their money is being spent properly.

## Proving impact

Proofs of Impact Claims can be done by an automated bot, an independent evaluator or some combination of the two . For instance, a bot could be used to analyze periodic satellite imagery to determine whether 1K acres of trees were actually planted or not; an independent evaluator can determine if 2K students are attending class or not, and both bots and evaluators can determine if 20 tons of food aid has been distributed or not.

Such Proofs of Impact Claims then become a important check on what organizations performing services are actually doing.  With over \$1T spent every year on UN’s SDG activities, understanding which organizations actually perform the work and which don’t is a major step towards optimizing the SDG process. But for Impact Claims and Proofs of Impact Claims to provide such feedback but they must be adequately traced back to identified parties, certified as trustworthy and be widely available.

## The ixo Foundation

The ixo Foundation is using open source, smart contract blockchains, personalized data privacy, and other technologies in the ixo Protocol for UN and other organizations to use to manage and provide trustworthy data on SDG projects from start to completion.

Trustworthy data seems a great application for blockchain technology. Blockchains have a number of features used to create trusted data:

1. Any impact claim and proofs of impacts become inherently immutable, once entered into a blockchain.
2. All parties to a project, funders, services and evaluators can be clearly identified and traced using the blockchain public key infrastructure.
3. Any data can be stored in a blockchain. So, any satellite imagery used, the automated analysis bot/program used, as well as any derived analysis result could all be stored in an intelligent blockchain.
4. Blockchain data is inherently widely available and distributed, in fact, blockchain data needs to be widely distributed in order to work properly.

## The ixo Protocol

The ixo Protocol is a method to manage (SDG) Impact projects. It starts with 3 main participants: funding agencies, service agents and evaluation agents.

• Funding agencies create and digitally sign new Impact Projects with pre-defined criteria to identify appropriate service  agencies which can do the work of the project and evaluation agencies which can evaluate the work being performed. Funding agencies also identify Impact Claim Template(s) for the project which identify standard ways to assess whether the project is being performed properly used by service agencies doing the work. Funding agencies also specify the evaluation criteria used by evaluation agencies to validate claims.
• Service agencies select among the open Impact Projects whichever ones they want to perform.  As the service agencies perform the work, impact claims are created according to templates defined by funders, digitally signed, recorded and collected into an Impact Claim Set underthe IXO protocol.  For example Impact Claims could be barcode scans off of food being distributed which are digitally signed by the servicing agent and agency. Impact claims can be constructed to not hold personal identification data but still cryptographically identify the appropriate parties performing the work.
• Evaluation agencies then take the impact claim set and perform the  evaluation process as specified by funding agencies. The evaluation insures that the Impact Claims reflect that the work is being done correctly and that the Impact Project is being executed properly. Impact claim evaluations are also digitally signed by the evaluation agency and agent(s), recorded and widely distributed.

The Impact Project definition, Impact Claim Templates, Impact Claim sets, Impact Claim Evaluations are all available worldwide, in an Global Impact Ledger and accessible to any and all funding agencies, service agencies and evaluation agencies.  At project completion, funding agencies should now have a granular record of all claims made by service agency’s agents for the project and what the evaluation agency says was actually done or not.

Such information can then be used to guide the next round of Impact Project awards to further advance the UN SDGs.

## Ambly project

The Ambly Project is using the ixo Protocol to supply childhood education to underprivileged children in South Africa.

It combines mobile apps with blockchain smart contracts to replace an existing paper based school attendance system.

The mobile app is used to record attendance each day which creates an impact claim which can then be validated by evaluators to insure children are being educated and properly attending class.

~~~

Blockchains have the potential to revolutionize financial services, provide supply chain provenance (e.g., diamonds with Blockchains at IBM), validate company to company contracts (Ethereum enters the enterprise) and now improve UN SDG attainment.

Welcome to the new blockchain world.

Photo Credit(s): What are Sustainable Development Goals, World Economic Forum;

IXO Foundation website

Ambly Project webpage

Read an article in Stanford Research, Crowdsourced research gives experience to global participants that discussed an activity in Stanford and other top tier research institutions to try to get global participation in academic research. The process is discussed more fully in a scientific paper (PDF here) by researchers from Stanford, MIT Media Lab, Cornell Tech and UC Santa Cruz.

They chose three projects:

• A HCI (human computer interaction) project to design, engineer and build a new paid crowd sourcing marketplace (like Amazon’s Mechanical Turk).
• A visual image recognition project to improve on current visual classification techniques/algorithms.
• A data science project to design and build the world’s largest wisdom of the crowds experiment.

The intent of crowdsourced research is to provide top tier academic research experience to persons which have no access to top research organizations.

Participating universities obtain more technically diverse researchers, larger research teams, larger research projects, and a geographically dispersed research community.

Collaborators win valuable academic research experience, research community contacts, and potential authorship of research papers as well as potential recommendation letters (for future work or academic placement),

## How does crowdresearch work?

It’s almost an open source and agile development applied to academic research. The work week starts with the principal investigator (PI) and research assistants (RAs) going over last week’s milestone deliveries to see which to pursue further next week. The crowdresearch uses a REDDIT like posting and up/down voting to determine which milestone deliverables are most important. The PI and RAs review this prioritized list to select a few to continue to investigate over the next week.

The PI holds an hour long video conference (using Google Hangouts On Air Youtube live stream service). On the conference call all collaborators can view the stream but only a select few are on camera. The PI and the researchers responsible for the important milestone research of the past week discuss their findings and the rest of the collaborators on the team can participate over Slack. The video conference is archived and available  to be watched offline.

At the end of the meeting, the PI identifies next weeks milestones and potentially directly responsible investigators (DRIs) to work on them.

The DRIs and other collaborators choose how to apportion the work for the next week and work commences. Collaboration can be fostered and monitored via Slack and if necessary, more Google live stream meetings.

If collaborators need help understanding some technology, technique, or too, the PI, RAs or DRIs can provide a mini video course on the topic or can point to other information used to get the researchers up to speed. Collaborators can ask questions and receive answers through Slack.

When it’s time to write the paper, they used Google Docs with change tracking to manage the writing process.

The team also maintained a Wiki on the overall project to help new and current members get up to speed on what’s going on. The Wiki would also list the week’s milestones, video archives, project history/information, milestone deliverables, etc.

At the end of the week, researchers and DRIs would supply a mini post to describe their work and link to their milestone deliverables so that everyone could review their results.

## Who gets credit for crowdresearch?

Each week, everyone on the project is allocated 100 credits and apportions these credits to other participants the weeks activities. The credits are  used to drive a page-rank credit assignment algorithm to determine an aggregate credit score for each researcher on the project.

Check out the paper linked above for more information on the credit algorithm. They tried to defeat (credit) link rings and other obvious approaches to stealing credit.

At the end of the project, the PI, DRIs and RAs determine a credit clip level for paper authorship. Paper authors are listed in credit order and the remaining, non-author collaborators are listed in an acknowledgements section of the paper.

The PIs can also use the credit level to determine how much of a recommendation letter to provide for researchers

## Tools for crowdresearch

The tools needed to collaborate on crowdresearch are cheap and readily available to anyone.

• Google Docs, Hangouts, Gmail are all freely available, although you may need to purchase more Drive space to host the work on the project.
• Wiki software is freely available as well from multiple sources including Wikipedia (MediaWiki).
• Slack is readily available for a low cost, but other open source alternatives exist, if that’s a problem.
• Github code repository is also readily available for a reasonable cost but  there may be alternatives that use Google Drive storage for the repo.
• Web hosting is needed to host the online Wiki, media and other assets.

Initial projects were chosen in computer science, so outside of the above tools, they could depend on open source. Other projects will need to consider how much experimental apparatus, how to fund these apparatus purchases, and how a global researchers can best make use of these.

## My crowdresearch projects

Some potential commercial crowdresearch projects where we could use aggregate credit score and perhaps other measures of participation to apportion revenue, if any.

• NVMe storage system using a light weight storage server supporting NVMe over fabric access to hybrid NVMe SSD – capacity disk storage.
• Proof of Stake (PoS) Ethereum pooling software using Linux servers to create a pool for PoS ETH mining.
• Bipedal, dual armed, dual handed, five-fingered assisted care robot to supply assistance and care to elders and disabled people throughout the world.

Non-commercial projects, where we would use aggregate credit score to apportion attribution and any potential remuneration.

• A fully (100%?) mechanical rover able to survive, rove around, perform  scientific analysis, receive/transmit data and possibly, effect repairs from within extreme environments such as the surface of Venus, Jupiter and Chernoble/Fukishima Daiichi reactor cores.
• Zero propellent interplanetary tug able to rapidly transport rovers, satellites, probes, etc. to any place within the solar system and deploy theme properly.
• A Venusian manned base habitat including the design, build process and ongoing support for the initial habitat and any expansion over time, such that the habitat can last 25 years.

Any collaborators across the world, interested in collaborating on any of these projects, do let me know, here via comments. Please supply some way to contact you and any skills you’re interested in developing or already have that can help the project(s).

I would be glad to take on PI role for the most popular project(s), if I get sufficient response (no idea what this would be). And  I’d be happy to purchase the Drive, GitHub, Slack and web hosting accounts needed to startup and continue to fruition the most popular project(s). And if there’s any, more domain experienced PIs interested in taking any of these projects do let me know.

Picture Credit(s): Crowd by Espen Sundve;

## A tale of two storage companies – NetApp and Vantara (HDS-Insight Grp-Pentaho)

It was the worst of times. The industry changes had been gathering for a decade almost and by this time were starting to hurt.

The cloud was taking over all new business and some of the old. Flash’s performance was making high performance easy and reducing storage requirements commensurately. Software defined was displacing low and midrange storage, which was fine for margins but injurious to revenues.

Both companies had user events in Vegas the last month, NetApp Insight 2017 last week and Hitachi NEXT2017 conference two weeks ago.

As both companies respond to industry trends, they provide an interesting comparison to watch companies in transition.

## Company role

• NetApp’s underlying theme is to change the world with data and they want to change to help companies do this.
• Vantara’s philosophy is data and processing is ultimately moving into the Internet of things (IoT) and they want to be wherever the data takes them.

Hitachi Vantara is a brand new company that combines Hitachi Data Systems, Hitachi Insight Group and Pentaho (an analytics acquisition) into one organization to go after the IoT market. Pentaho will continue as a separate brand/subsidiary, but HDS and Insight Group cease to exist as separate companies/subsidiaries and are now inside Vantara.

NetApp sees transitions occurring in the way IT conducts business but ultimately, a continuing and ongoing role for IT. NetApp’s ultimate role is as a data service provider to IT.

## Customer problem

• Vantara believes the main customer issue is the need to digitize the business. Because competition is emerging everywhere, the only way for a company to succeed against this interminable onslaught is to digitize everything. That is digitize your manufacturing/service production, sales, marketing, maintenance, any and all customer touch points, across your whole value chain and do it as rapidly as possible. If you don’t your competition will.
• NetApp sees customers today have three potential concerns: 1) how to modernize current infrastructure; 2) how to take advantage of (hybrid) cloud; and 3) how to build out the next generation data center. Modernization is needed to free capital and expense from traditional IT for use in Hybrid cloud and next generation data centers. Most organizations have all three going on concurrently.

Vantara sees the threat of startups, regional operators and more advanced digitized competitors as existential for today’s companies. The only way to keep your business alive under these onslaughts is to optimize your value delivery. And to do that, you have to digitize every step in that path.

NetApp views the threat to IT as originating from LoB/shadow IT originating applications born and grown in the cloud or other groups creating next gen applications using capabilities outside of IT.

## Product direction

• NetApp is looking mostly towards the cloud. At their conference they announced a new Azure NFS service powered by NetApp. They already had Cloud ONTAP and NPS, both current cloud offerings, a software defined storage in the cloud and a co-lo hardware offering directly attached to public cloud (Azure & AWS), respectively.
• Vantara is looking towards IoT. At their conference they announced Lumada 2.0, an Industrial IoT (IIoT) product framework using plenty of Hitachi software functionality and intended to bring data and analytics under one software umbrella.

NetApp is following a path laid down years past when they devised the data fabric. Now, they are integrating and implementing data fabric across their whole product line. With the ultimate goal that wherever your data goes, the data fabric will be there to help you with it.

Vantara is broadening their focus, from IT products and solutions to IoT. It’s not so much an abandoning present day IT, as looking forward to the day where present day IT is just one cog in an ever expanding, completely integrated digital entity which the new organization becomes.

They both had other announcements, NetApp announced ONTAP 9.3, Active IQ (AI applied to predictive service) and FlexPod SF ([H]CI with SolidFire storage) and Vantara announced a new IoT turnkey appliance running Lumada and a smart data center (IoT) solution.

## Who’s right?

They both are.

Digitization is the future, the sooner organizations realize and embrace this, the better for their long term health. Digitization will happen with or without organizations and when it does, it will result in a significant re-ordering of today’s competitive landscape. IoT is one component of organizational digitization, specifically outside of IT data centers, but using IT resources.

In the mean time, IT must become more effective and efficient. This means it has to modernize to free up resources to support (hybrid) cloud applications and supply the infrastructure needed for next gen applications.

One could argue that Vantara is positioning themselves for the long term and NetApp is positioning themselves for the short term. But that denies the possibility that IT will have a role in digitization. In the end both are correct and both can succeed if they deliver on their promise.

## Zipline delivers blood 7X24 using fixed wing drones in Rwanda

Read an article the other day in MIT Tech Review (Zipline’s ambitious medical drone delivery in Africa) about a startup in Silicon Valley, Zipline, that has started delivering blood by drones to remote medical centers in Rwanda.

We’ve talked about drones before (see my Drones as a leapfrog technology post) and how they could be another leapfrog 3rd world countries into the 21st century. Similar, to cell phones, drones could be used to advance infrastructure without having to go replicate the same paths as 1st world countries such as building roads/hiways, trains and other transport infrastructure.

# The country

Rwanda is a very hilly but small (10.2K SqMi/26.3 SqKm) and populous (pop. 11.3m) country in east-central Africa, just a few degrees south of the Equator. Rwanda’s economy is based on subsistence agriculture with a growing eco-tourism segment.

Nonetheless, with all
its hills and poverty roads in Rwanda are not the best. In the past delivering blood supplies to remote health centers could often take hours or more. But with the new Zipline drone delivery service technicians can order up blood products with an app on a smart phone and have it delivered via parachute to their center within 20 minutes.

## Drone delivery operations

In the nest, a center for drone operations, there is a tent housing the blood supplies, and logistics for the drone force. Beside the tent are a steel runway/catapults that can launch drones and on the other side of the tent are brown inflatable pillows  used to land the drones.

The drones take a pre-planned path to the remote health centers and drop their cargo via parachute to within a five meter diameter circle.

Operators fly the drones using an iPad and each drone has an internal navigation system. Drones fly a pre-planned flightaugmented with realtime kinematic satellite navigation. Drone travel is integrated within Rwanda’s controlled air space. Routes are pre-mapped using detailed ground surveys.

## Drone delivery works

Zipline drone blood deliveries have been taking place since late 2016. Deliveries started M-F, during daylight only. But by April, they were delivering 7 days a week, day and night.

Zipline currently only operates in Rwanda and only delivers blood but they have plans to extend deliveries to other medical products and to expand beyond Rwanda.

On their website they stated that before Zipline, delivering blood to one health center would take four hours by truck which can now be done in 17 minutes. Their Muhanga drone center serves 21 medical centers throughout western Rwanda.

Photo Credits: Flyzipline.com

## The fragility of public cloud IT

I have been reading AntiFragile again (by Nassim Taleb). And although he would probably disagree with my use of his concepts, it appears to me that IT is becoming more fragile, not less.

For example, recent outages at major public cloud providers display increased fragility for IT. Yet these problems, although almost national in scope, seldom deter individual organizations from their migration to the cloud.

## Tragedy of the cloud commons

The issues are somewhat similar to the tragedy of the commons. When more and more entities use a common pool of resources, occasionally that common pool can become degraded. But because no-one really owns the common resources no one has any incentive to improve the situation.

Now the public cloud, although certainly a common pool of resources, is also most assuredly owned by corporations. So it’s not a true tragedy of the commons problem. Public cloud corporations have a real incentive to improve their services.

However, the fragility of IT in general, the web, and other electronic/data services all increases as they become more and more reliant on public cloud, common infrastructure. And I would propose this general IT fragility is really not owned by any one person, corporation or organization, let alone the public cloud providers.

## Pre-cloud was less fragile, post-cloud more so

In the old days of last century, pre-cloud, if a human screwed up a CLI command the worst they could happen was to take out a corporation’s data services. Nowadays, post-cloud, if a similar human screws up a CLI command, the worst that can happen is that major portions of the internet services of a nation go down.

Yes, over time, public cloud services have become better at not causing outages, but they aren’t going away. And if anything, better public cloud services just encourages more corporations to use them for more data services, causing any subsequent cloud outage to be more impactful, not less

The Internet was originally designed by DARPA to be more resilient to failures, outages and nuclear attack. But by centralizing IT infrastructure onto public cloud common infrastructure, we are reversing the web’s inherent fault tolerance and causing IT to be more susceptible to failures.

## What can be done?

There are certainly things that can be done to improve the situation and make IT less fragile in the short and long run:

1. Use the cloud for non-essential or temporary data services, that don’t hurt a corporation, organization or nation when outages occur.
2. Build in fault-tolerance, automatic switchover for public cloud data services to other regions/clouds.
3. Physically partition public cloud infrastructure into more regions and physically separate infrastructure segments within regions, such that any one admin has limited control over an amount of public cloud infrastructure.
4. Divide an organizations or nations data services across public cloud infrastructures, across as many regions and segments as possible.
5. Create a National Public IT Safety Board, not unlike the one for transportation, that does a formal post-mortem of every public cloud outage, proposes fixes, and enforces fix compliance.

## The National Public IT Safety Board

The National Transportation Safety Board (NTSB) has worked well for air transportation. It relies on the cooperation of multiple equipment vendors, airlines, countries and other parties. It performs formal post mortems on any air transportation failure. It also enforces any fixes in processes, procedures, training and any other activities on equipment vendors, maintenance services, pilots, airlines and other entities that can impact public air transport safety. At the moment, air transport is probably the safest form of transportation available, and much of this is due to the NTSB

We need something similar for public (cloud) IT services. Yes most public cloud companies are doing this sort of work themselves in isolation, but we have a pressing need to accelerate this process across cloud vendors to improve public IT reliability even faster.

The public cloud is here to stay and if anything will become more encompassing, running more and more of the worlds IT. And as IoT, AI and automation becomes more pervasive, data processes that support these services, which will, no doubt run in the cloud, can impact public safety. Just think of what would happen in the future if an outage occurred in a major cloud provider running the backend for self-guided car algorithms during rush hour.

If the public cloud is to remain (at this point almost inevitable) then the safety and continuous functioning of this infrastructure becomes a public concern. As such, having a National Public IT Safety Board seems like the only way to have some entity own IT’s increased fragility due to  public cloud infrastructure consolidation.

~~~~

In the meantime, as corporations, government and other entities contemplate migrating data services to the cloud, they should consider the broader impact they are having on the reliability of public IT. When public cloud outages occur, all organizations suffer from the reduced public perception of IT service reliability.

Photo Credits: Fragile by Bart Everson; Fragile Planet by Dave Ginsberg; Strange Clouds by Michael Roper

## Ethereum enters the enterprise

Read an article the other day on NYT (Business Giants Announce Creation of … Ethereum).

In case you don’t know Ethereum is a open source, block chain solution that’s different than the software behind Bitcoin and IBM’s Hyperledger (for more on Hyperledger see our Blockchains at IBM post or our GreyBeardsOnStorage podcast with Donna Dillinger, IBM Fellow).

Blockchains are a software based, permanent ledger which can be used to record anything. Bitcoin, Ethereum and Hyperledger are all based on blockchains that provide similar digital information services with varying security, programability, consensus characteristics, etc.

Blockchains represent an entirely new way of doing business in the digital world and have the potential to take over many financial services  and other contracting activities that are done today between organizations.

Blockchain services provide the decentralized recording of transactions into an immutable ledger.  The decentralized nature of blockchains makes it difficult (if not impossible) to game the system to record an invalid transaction.

## Miners

Ethereum supports an Ethereum Virtual Machine (EVM) application which offers customers and users a more programmable blockchain. That is rather than just updating accounts with monetary transactions like Bitcoin does, one can implement specialized transaction processing for updating the immutable ledger. It’s this programability that allows for the creation of “smart contracts” which can be programmatically verified and executed.

Ethereum miner nodes are responsible for validating transactions and the state transition(s) that update the ledger. Transactions are grouped in blocks by miners.

Miners are responsible for validating the transaction block and performing a hard mathematical computation or proof of work (PoW) which goes along used to validate the block of transactions. Once the PoW computation is complete, the block is packaged up and the miner node updates its database (ledger) and communicates its result to all the other nodes on the network which updates their transaction ledgers as well. This constitutes one state transition of the Ethereum ledger.

Miners that validate Ethereum transactions get paid in Ethers, which are a form of currency throughout the Ethereum ecosystem.

## Blockchain consensus

Ethereum ledger consensus is based on the miner nodes executing the PoW algorithm properly. The current Ethereal PoW algorithm is Ethash, which is an “ASIC resistant” algorithm. What this means is that standard GPUs and (less so) CPUs are already very well optimized to perform this algorithm and any potential ASIC designer, if they could do better, would make more money selling their design to GPU and CPU designers, than trying to game the system.

One problem with Bitcoin is that its PoW is more ASIC friendly, which has led some organizations to developing special purpose ASICs in an attempt to dominate Bitcoin mining. If they can dominate Bitcoin mining, this can  be used to game the Bitcoin consensus system and potentially implement invalid transactions.

## Ethereum Accounts

Ethereum has two types of accounts:

• Contract accounts that are controlled by the EVM application code, or
• Externally owned accounts (EOA) that are controlled by a set of private keys and represent external agents (miner nodes, people, transaction generating entities)

Contract accounts really are code and data which constitute the EVM bytecode (application). Contract account bytecode is also stored on the Ethereum ledger (when deployed?) and are associated with an EOA that initiates the Contract account.

Contract functionality is written in Solidity, Serpent, Lisp Like Language (LLL) or other languages that can be compiled into EVM bytecode. Smart contracts use Ethereum Contract accounts to validate and execute contract actions.

## Ethereum gas pricing

As EVMs contract accounts can consume arbitrary amounts of computation, bandwidth and storage to process transactions,   Ethereum uses a concept called “gas” to pay for their resource consumption.

When a contract account transaction is initiated, it identifies a gas price (in Ethers) and a maximum gas amount that it is willing to consume to process the transaction.

When a contract transaction takes place:

• If the maximum gas amount is less than what the transaction consumes, then the transaction is executed and is applied to the ledger. Any left over or remaining gas Ethers is credited back to the EOA.
• If the maximum gas amount is not enough to execute the transaction, then the transaction fails and no update occurs.

## Enterprise Ethereum Alliance

What’s new to Ethereum is that Accenture, Bank of New York Mellon, BP, CreditSuisse, Intel, Microsoft, JP Morgan, UBS and many others have joined together to form the Enterprise Ethereum Alliance. The alliance intends to work to create a standard version of the Ethereum software that enterprise companies can use to manage smart contracts.

Microsoft has had a Azure Blockchain-as-a-Service online since 2015.  This was based on an earlier version of Ethereum called Project Bletchley.

Ethereum seems to be an alternative to IBM Hyperledger, which offers another enterprise class block chain for smart contracts. As enterprise class blockchains look like they will transform the way companies do business in the future, having multiple enterprise class blockchain solutions seems smart to many companies.

Photo Credit(s): Miner by Mark Callahan; Gas prices by Corpsman.com; File: Ether pharmecie.jpg by Wikimedia

## A college course on identifying BS

Read an article the other day from Recode (These University of Washington professors teaching a course on Calling BS) that seems very timely. The syllabus is online (Calling Bullshit — Syllabus) and it looks like a great start on identifying falsehood wherever it can be found.

## In the beginning, what’s BS?

The course syllabus starts out referencing Brandolini’s Bullshit Asymmetry Principal (Law): the amount of energy needed to refute BS is an order of magnitude bigger than to produce it.

Then it goes into a rather lengthy definition of BS from Harry Frankfort’s 1986 On Bullshit article. In sum, it starts out reviewing a previous author’s discussions on Humbug and ends up at the OED. Suffice it to say Frankfurt’s description of BS runs the gamut from: Deceptive misrepresentation to short of lying.

They course syllabus goes on to reference two lengthy discussions/comments on Frankfurt’s seminal On Bullshit article, but both Cohen’s response, Deeper into BS and Eubank & Schaeffer’s A kind word for BS: …  are focused more on academic research rather than everyday life and news.

## How to mathematically test for BS

The course then goes into mathematical tests for BS that range from Fermi’s questions, the Grim Test and Benford’s 1936 Law of Anomalous Numbers. These tests are all ways of looking at data and numbers and estimating whether they are bogus or not. Benford’s paper/book talks about how the first page of logarithms is always more used than others because numbers that start with 1 are more frequent than any other number.

## How rumors propagate

The next section of the course (week 4) talks about the natural ecology of BS.

Here there’s reference to an article by Friggeri, et al, on Rumor Cascades, which discusses the frequency with which patently both true, false and partially true/partially false rumors are “shared” on social media (Facebook).

The professors look at a website called Snopes.com which evaluates the veracity of publishes rumors uses this to classify the veracity of rumors. Next they examine how these rumors are shared over time on Facebook.

Summarizing their research, both false and true rumors propagate sporadically on Facebook. But even verified false or mixed true/mixed false rumors (identified by Snopes.com) continue to propagate on Facebook. This seems to indicate that rumor sharers are ignoring the rumor’s truthfulness or are just unaware of the Snopes.com assessment of the rumor.

## Other topics on calling BS

The course syllabus goes on to causality (correlation is not causation, a common misconception used in BS), statistical traps and trickery (used to create BS), data visualization (which can be used to hide BS), big data (GiGo leads to BS), publication bias (e.g., most published research presents positive results, where’s all the negative results research…), predatory publishing and scientific misconduct (organizations that work to create BS for others), the ethics of calling BS (the line between criticism and harassment), fake news and refuting BS.

## Fake news

The section on Fake News is very interesting. They reference an article in the NYT, The Agency about how a group in Russia have been reaping havoc across the internet with fake news and bogus news sites.

But there’s more another article on NYT website, Inside a fake news sausage factory, details how multiple websites started publishing bogus news and then used advertisement revenue to tell them which bogus news generated more ad revenue – apparently there’s money to be made in advertising fake news. (Sigh, probably explains why I can’t seem to get any sponsors for my websites…).

## Improving the course

How to improve their course? I’d certainly take a look at what Facebook and others are doing to identify BS/fake news and see if these are working effectively.

Another area to add might be a historical review of fake rumors, news or information. This is not a new phenomenon. It’s been going on since time began.

In addition, there’s little discussion of the consequences of BS on life, politics, war, etc. The world has been irrevocably changed in the past  on account of false information. Knowing how bad this has been this might lend some urgency to studying how to better identify BS.

There’s a lot of focus on Academia in the course and although this is no doubt needed, most people need to understand whether the news they see every day is fake or not. Focusing more on this would be worthwhile.

~~~~

I admire the University of Washington professors putting this course together. It’s really something that everyone needs to understand  nowadays.

They say the lectures will be recorded and published online – good for them. Also, the current course syllabus is for a one credit hour course but they would like to expand it to a three to four credit hour course – another great idea