R&D measures – Silverton Consulting

LLM exhibits Theory of Mind

Posted on February 10, 2023 by Ray in Artificial Intelligence, Cognitive computing, Machine Learning, R&D measures, Strategic Inflection Points, Uncategorized

Ran across an interesting article today (thank you John Grant/MLOps.community slack channel), titled Theory of Mind may have spontaneously emerged in Large Language Models, by M. Kosinski from Stanford. The researcher tested various large language models (LLMs) on psychological tests to determine the level of theory of mind (ToM) the models had achieved.

Earlier OpenAI GPT models (GPT-1, GPT-2 and the original GPT-3) showed almost no ToM capabilities, but the latest version, GPT-3.5, shows ToM equivalent to that of 8 to 9 year olds.

Theory of Mind

According to Wikipedia (Theory Of Mind article), ToM is “…the capacity to understand other people by ascribing mental states to them (that is, surmising what is happening in their mind).” This seems to be one way people use to understand one another.

For instance, if I can somehow guess what you are thinking about a topic, situation, or event, I can hopefully communicate with you better than if I can’t. At least that’s the psychological perspective.

The belief is that people with Asperger’s, ADHD, schizophrenia, and other conditions all show ToM deficits when compared to the general population. As a result, over time, psychologists have developed tests to measure a person’s ToM.

These tests typically involve putting 2 people in a situation with props and other indicators of what a person is thinking, and asking one of them what they think the other person is thinking. Psychologists grade a person’s ToM based on expected results.

ToM and LLM

The researcher took these tests, with their people, props and situations, and converted them into text that mimicked the situations used in ToM testing. For each one, the researcher created a textual story (or pretext) and a set of text prompts about the situation, which the LLM was to complete.

For example, one pretext or story is the following:

“Here is a bag filled with popcorn. There is no chocolate in the bag. Yet, the label on the bag says ‘chocolate’ and not ‘popcorn.’ Sam finds the bag. She had never seen the bag before. She cannot see what is inside the bag. She reads the label.”

This is fed into the LLM and then a prompt is provided, such as:

“She opens the bag and looks inside. She can clearly see that it is full of _________”

If the LLM has ToM then, based on the pretext and prompt, it will say “popcorn”. But the LLM also provides a follow-on sentence or two which describes the situation as the LLM understands it:

“popcorn. Sam is confused. She wonders why the label says ‘chocolate’ when the bag is clearly filled with popcorn. She looks around to see if there is any other information about the bag. She finds nothing. She decides to take the bag to the store where she bought it and ask for an explanation.”

The completion text above (shown in bold in the original) is generated by a ToM-capable LLM. The researcher also showed the probability the LLM assigned to the first word of its completion. In the case above, it showed [P(popcorn) = 100%; P(chocolate) = 0%].

They also use different prompts with the same story to see if the LLM truly shows ToM. For instance, something like “She believes the bag is full of ___________” and “She’s delighted finding the bag, she loves eating _______”. This provides a sort of test of the LLM’s comprehension of the situation.

The researcher controlled for word frequency using reversals of the key words in the story, i.e., the bag has chocolate but says popcorn. They also generated scrambled versions of the story, in which the first set of occurrences of chocolate and popcorn was replaced with either word at random. They reset the model between each case. In the paper they show the LLMs’ success rate on 10,000 scrambled versions, some of which were correct.
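To make the reversal and scrambling controls concrete, here is a minimal Python sketch. The template, slot names and function names are my own (hypothetical), and I’m assuming every key-word slot gets randomized in the scrambled case; the paper’s actual procedure may differ.

```python
import random

# Hypothetical template for the story above. s1..s4 are the key-word slots.
TEMPLATE = (
    "Here is a bag filled with {s1}. There is no {s2} in the bag. "
    "Yet, the label on the bag says '{s3}' and not '{s4}.' "
    "Sam finds the bag. She had never seen the bag before. "
    "She cannot see what is inside the bag. She reads the label."
)

def original_story(contents, label):
    """Story where the bag holds `contents` but is labeled `label`."""
    return TEMPLATE.format(s1=contents, s2=label, s3=label, s4=contents)

def reversed_story(contents, label):
    """Reversal control: swap the roles of the two key words."""
    return original_story(label, contents)

def scrambled_story(contents, label, rng):
    """Scrambled control (assumption): each slot gets either word at random."""
    words = [contents, label]
    slots = {f"s{i}": rng.choice(words) for i in range(1, 5)}
    return TEMPLATE.format(**slots)

rng = random.Random(0)  # fixed seed so the scrambled story is reproducible
story = original_story("popcorn", "chocolate")
flipped = reversed_story("popcorn", "chocolate")
scrambled = scrambled_story("popcorn", "chocolate", rng)
```

Note that a scrambled story can, by chance, come out identical to the original, which fits the remark that some scrambled versions were correct.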

They labeled the above series of tests as “Unexpected content tasks”. But they also included another type of ToM test which they labeled “Unexpected transfer tasks”.

Unexpected transfer tasks involve a story in which person A sees person B put a pet in a basket, person B leaves, and person A then moves the pet. The LLM is then prompted to see if it understands where the pet is and how person B would react when they got back.

In the end, after trying to statistically control the stories and prompts as much as possible, the researcher ended up creating 20 unique stories and presenting the prompts to the LLM.

Results of their ToM testing on a select set of LLMs look like:

As can be seen from the graphic, the latest version of GPT-3.5 (davinci-003 with 176B* parameters) achieved something like an 8yr old in Unexpected Contents Tasks and a 9yr old on Unexpected Transfer Tasks.

The researchers showed other charts that tracked LLM probabilities on (for example in the first story above) bag contents and Sam’s belief. They measured this for every sentence of the story.

Not sure why this is important but it does show how the LLM interprets the story. Unclear how they got these internal probabilities but maybe they used the prompts at various points in the story.

The paper shows that according to their testing, GPT-3.5 davinci-003 clearly provides a level of ToM of an 8-9yr old on ToM tasks they have translated into text.

The paper says they created 20 stories and 6 prompts which they reversed and scrambled. But 20 tales seems less than statistically significant even with reversals and randomization. And yet, there’s clearly a growing level of ToM in the models as they get more sophisticated or change over time.

Psychology has come up with many tests to ascertain whether a person is “normal” or not. Wikipedia (Psychological testing article) lists over 13 classes of psychological tests which include intelligence, personality, aptitude, etc.

Now that LLMs seem to have mastered textual input and output generation, it would be worthwhile to translate all psychological tests into text and try them out on all LLMs, to track where they are today and how they have trended over time.

I could see at some point using something akin to multiple psychological test scores as a way to grade LLMs over time.

So today’s GPT-3.5 has a ToM of an 8-9yr old. It will be very interesting to see what GPT-4 does on similar testing.

Comments?

Picture Credit(s)

Table1 from the ToM may have spontaneously emerged from LLM paper
Figure 3 from the ToM may have spontaneously emerged from LLM paper
Figure 1 from the ToM may have spontaneously emerged from LLM paper

BEHAVIOR, an in-home robot, benchmark

Posted on November 19, 2021 by Ray in Artificial Intelligence, Machine Learning, R&D measures, Robots, Scenario planning, Strategic Inflection Points, System effectiveness

As my readers probably already know, I’m a long time benchmark geek. So when I recently read an article out of Stanford (AI Experts Establish the “North Star” for Domestic Robotics Field) where a research team there developed a new robotic benchmark, I was interested. The new robotics benchmark is called BEHAVIOR which was documented in an ARXIV.org article (see: BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and ecOlogical enviRonments). It essentially uses real world data to identify domestic work activities that any robot would need to perform in a home.

The problems with robot benchmarks

The problems with benchmarks are multi-faceted:

How realistic are the workloads used to evaluate the systems being measured?
How accurate are the metrics used to rank and judge benchmark submissions?
How costly/complex is it to run a benchmark?
How are submissions audited and are they reproducible?
Where are benchmark results reported and are they public?

And of course robotics brings in its own issues that make benchmarking more difficult:

What sensors does the robot have to understand how to complete tasks?
What manipulators does the robot have to perform the tasks required of it?
Do the robots move in the environment and if so, how do the robots move?
Does the robot perform the task in the real world or in a simulated environment?

And of course, when using a simulated environment, how realistic is it?

BEHAVIOR with iGibson (see below) seems to answer many of these concerns for in-home robot benchmarking.

What is BEHAVIOR?

First, BEHAVIOR’s home making tasks were selected from the American Time Use Survey, maintained by the US Bureau of Labor Statistics, which identifies tasks Americans perform in their homes. With BEHAVIOR 1.0 there are 100 tasks ranging from building a fruit basket to cleaning a toilet, and just about everything in between. I didn’t see any cooking or mixing drinks tasks but maybe those will be added.

Second, BEHAVIOR uses a predicate logic, called BDDL (BEHAVIOR Domain Definition Language) to define initial conditions for tasks such as tables, chairs, books, etc located in the room, where objects need to be placed, and successful completion goals or what task completion should look like.

BEHAVIOR uses 15 different rooms or scenes in their benchmark, such as a kitchen, garage, study, etc. Each of the 100 tasks is performed in a specific room.

BEHAVIOR incorporates 1217 different objects in 391 categories. Once initial conditions are defined for a task, BEHAVIOR essentially randomly selects different objects for the task and randomly locates them throughout the room.

In order to run the benchmark, one could conceivably create a real room with all the objects placed according to BEHAVIOR BDDL’s randomly assigned locations, and have a physical robot perform the assigned task in it. OR one could use a simulation engine and have the robot run the task in the simulation environment, with a simulated room, objects and robot.

It appears as if BEHAVIOR could operate in any robotics simulation environment but has been currently implemented in Stanford’s open source robotics simulation engine called iGibson 2.0 (see: iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks and iGibson 2.0 website). iGibson uses the Bullet real time physics engine for realistic physical environment simulation.

A robot operating within iGibson is provided a 3D rendering of the room and objects as images or LIDAR sensor scans. It can then identify the objects that it needs to manipulate to perform the tasks. One can define the robot’s simulated sensors and manipulators in iGibson 2.0. It’s written in Python, is open source (GitHub Repo) and can be installed to run on Linux (Ubuntu 16.04), Windows (10) or Mac (10.15) systems.

Finally, BEHAVIOR uses a set of metrics to determine how well a robot has performed its assigned task. Their first metric is a success score, defined as the fraction of goal conditions satisfied by the robot performing the task, such as the number of dishes properly cleaned and placed in the drying rack divided by the total number of dishes for a “washing dishes” task. Their second is a set of efficiency metrics, like time to complete a task, sum total of object distance moved during the task, how well objects are arranged at task completion (is the toilet seat down…), etc.
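The success score is simple enough to sketch in a few lines of Python. This is just my reading of the definition above, not BEHAVIOR’s own code; the function name and the goal condition names are hypothetical.

```python
def success_score(goal_conditions):
    """Fraction of goal conditions satisfied, from 0.0 to 1.0.

    goal_conditions maps a (hypothetical) condition name to True/False.
    """
    if not goal_conditions:
        raise ValueError("a task needs at least one goal condition")
    satisfied = sum(1 for ok in goal_conditions.values() if ok)
    return satisfied / len(goal_conditions)

# A made-up "washing dishes" outcome: one cup never made it to the rack.
washing_dishes = {
    "plate_1_clean_in_rack": True,
    "plate_2_clean_in_rack": True,
    "cup_1_clean_in_rack": False,
    "bowl_1_clean_in_rack": True,
}
score = success_score(washing_dishes)  # 3 of 4 conditions satisfied
```

A score of 1.0 would mean every goal condition was met, so partial credit falls out naturally for tasks a robot only half finishes.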

Another feature of iGibson 2.0 is that it offers the ability to record a human (in VR) doing a task in its simulated environment. So if your robotic system is able to learn by example, then iGibson could be used to provide training data for an activity.

~~~~

A couple of additions to the BEHAVIOR benchmark/iGibson simulation environment that I would like to see:

There ought to be a way to construct a house/apartment where multiple rooms are arranged in a hierarchy, i.e., rooms associated with floors, with connections between them using hallways, doors, stairs, etc. This way one could conceivably define a set of homes/apartments (let’s say 5) that a robot would perform its tasks in.
They need a task list to drive robot activities. Assume that there’s some amount of time let’s say 8-12 hours that a robot is active and construct a series of tasks that need to be accomplished during that period.
Robots should be placed in the rooms/apartments/homes at random with random orientation and then they would have to navigate through rooms/passageways to the rooms to perform the tasks.
They need to add pet/human avatars in the rooms throughout a home. These would represent real time obstacles to task completion/navigation as well as add more tasks associated with caring for pets/humans.
They need the ability to add non-home rooms that could encompass factory floors, emergency response debris fields, grocery stores, etc. and their own unique set of tasks for each of these so that it could be used as a benchmark for more than just domestic robots.

Aside from the above additions to BEHAVIOR/iGibson 2.0, there’s the question of the organization that manages the benchmark and submissions. There needs to be a website/place to publish benchmark results for a robot AND a mechanism to audit results for accuracy to ensure fair play.

Typically this would be associated with an organization responsible for publishing and auditing submissions as well as guide further development of BEHAVIOR/iGibson 2.0. BEHAVIOR 1.0 is not the end but it’s a great start at providing realistic tasks that any domestic robot would need to perform.

Benchmarks have always aided the development and assessment of new technologies. Having an in-home robot benchmark like BEHAVIOR makes getting domestic robots that do what we want them to do a more likely possibility someday.

There’s a new benchmark in town and it signals the dawning of the domestic robot age.

Swarm learning for distributed & confidential machine learning

Posted on June 2, 2021 by Ray in Artificial Intelligence, Data security, Deep Learning, Distributed computing, Information economy, Machine Learning, Neural network, R&D measures, Strategic Inflection Points

Read an article the other week about researchers in Germany working with a form of distributed machine learning they called swarm learning (see: AI with swarm intelligence: a novel technology for cooperative analysis …) which was reporting on a Nature magazine article (see: Swarm Learning for decentralized and confidential clinical machine learning).

The problem of shared machine learning is particularly acute with medical data. Many countries specifically call out patient medical information as data that can’t be shared between organizations (even within country) unless specifically authorized by a patient.

So these organizations and others are turning to distributed machine learning as a way to 1) protect data across nodes and 2) provide accurate predictions that use all the data even though portions of that data aren’t visible. There are two forms of distributed machine learning that I’m aware of: federated and now swarm learning.

The main advantage of federated and swarm learning is that the data can be kept in the hospital, medical lab or facility without having to be revealed outside that privileged domain, BUT the [machine] learning that’s derived from that data can be shared with other organizations and used in aggregate to increase the prediction/classification model accuracy across all locations.

How distributed machine learning works

Distributed machine learning starts with a common model that all nodes will download and use to share learnings. At some agreed to time (across the learning network), all the nodes use their latest data to re-train the common model and share new training results (essentially weights used in the neural network layers) with all other members of the learning network.

Shared learnings would be encrypted with TLS plus some form of homomorphic encryption that allowed for calculations over the encrypted data.

In swarm learning, the sharing mechanism is facilitated by a permissioned blockchain (apparently Ethereum). All learning nodes use this blockchain to share learnings and download any updates to the common model after sharing.

Federated vs. Swarm learning

The main difference between federated and swarm learning is that federated learning has a central authority that updates the model(s), while in swarm learning that processing is replaced by a smart contract executing within the blockchain. Each node updates the blockchain with its shared data and, once all updates are in, this triggers a smart contract to execute some Ethereum VM code which aggregates all the learnings and constructs a new model (or at least new weights for the model). Thus no single node is responsible for updating the model; it’s all embedded into a smart contract within the Ethereum blockchain.

But how does the swarm (or smart contract) update the common model’s weights? The Nature article states that they used either a straight average or a weighted average (weighted by the “weight” of a node [we assume this is a function of the node’s re-training dataset size]) to update all parameters of the common model(s).
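Here is a minimal Python sketch of that merge step, covering both the straight and the weighted average. It is not the HPE Swarm Learning code or the smart contract itself. The function name is hypothetical, model parameters are flattened to plain lists of floats, and using dataset size as the node weight is the same assumption made above.

```python
def merge_parameters(node_params, node_weights=None):
    """Average per-node parameter vectors into one common vector.

    node_params: list of equal-length lists of floats, one list per node.
    node_weights: optional per-node weights (assumed here to be dataset
    sizes); None gives a straight, unweighted average.
    """
    if not node_params:
        raise ValueError("need parameters from at least one node")
    if node_weights is None:
        node_weights = [1.0] * len(node_params)
    if len(node_weights) != len(node_params):
        raise ValueError("one weight per node is required")
    total = sum(node_weights)
    if total <= 0:
        raise ValueError("node weights must sum to a positive number")
    size = len(node_params[0])
    if any(len(params) != size for params in node_params):
        raise ValueError("all nodes must share the same model shape")

    merged = [0.0] * size
    for params, weight in zip(node_params, node_weights):
        for i, value in enumerate(params):
            merged[i] += value * weight
    return [value / total for value in merged]

# Three nodes, two parameters each; the third node has twice the data.
params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
straight = merge_parameters(params)
weighted = merge_parameters(params, node_weights=[100, 100, 200])
```

With the weighted average, the node with more re-training data pulls the common model further toward its own parameters.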

Testing Swarm vs. Centralized vs. Individual (node) model learning

In the Nature paper, the researchers compared a central model, where all data is available to retrain the models, with one utilizing swarm learning. To perform the comparison, they had all nodes contribute 20% of their test data to a central repository, and ran the common swarm-updated model against this data to compute an accuracy metric for the swarm. The resulting accuracies of the central and swarm learning approaches look identical.

They also compared each individual node (just using the common model and then retraining it over time without sharing this information with the swarm) against the swarm learning approach. In this comparison the swarm learning approach always seemed to have as good if not better accuracy and much narrower dispersion.

In the Nature paper, the researchers used swarm learning to manage the machine learning model predictions for detecting COVID-19, leukemia, tuberculosis, and other lung diseases. All of these used public data, which included PBMC (peripheral blood mononuclear cell) transcriptome data, whole blood transcriptome data, and X-ray images.

Swarm learning also provides the ability to onboard new nodes in the network. Onboarding supplies the common model and its current weights to the new node and adds it to the shared learning smart contract.

The code for swarm learning can be downloaded from HPE (requires an HPE passport login [it’s free]). The code for the models and data processing used in the paper is available from GitHub. All this seems relatively straightforward: one could use the HPE Swarm Learning Library to facilitate doing this or code it up oneself.

Photo Credit(s):

From Nature magazine article Swarm Learning for decentralized and confidential clinical machine learning

Using AI to identify research to invest in

Posted on May 19, 2021 (updated August 9, 2022) by Ray in Artificial Intelligence, Data analytics, Deep Learning, Forecasting, Machine Learning, R&D measures, Visionary leadershp

Saw an article from MIT News (Using machine learning to predict high-impact research) on how researchers there were able to train an AI model to predict which scientific research was going to be the most impactful (foundational) over time. The news article was reporting on research written up in a Nature article (Learning on knowledge graph dynamics provides an early warning of impactful research, behind paywall). The researchers proposed that institutions and VC should use their new DELPHI (Dynamic Early-warning by Learning to Predict High Impact [research]) tool to find foundational research to invest in.

Attempts to identify good research have been active for years. For example, CiteSeerX and others like it use an article’s citation index to rank research. Citation indexes are sort of like Google’s PageRank and use a count of how many citations a research paper has garnered since publication as their metric of importance.
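As a rough illustration of a citation-count ranking, here is a small Python sketch. The paper IDs and the function name are made up, and this is plain counting, not how CiteSeerX (or PageRank) actually computes its rankings.

```python
from collections import Counter

def rank_by_citations(citations):
    """Rank papers by how many times they are cited.

    citations: iterable of (citing_paper, cited_paper) pairs.
    Returns (paper, count) tuples, most cited first.
    """
    counts = Counter(cited for _citing, cited in citations)
    return counts.most_common()

# Hypothetical citation pairs: paper A is cited three times, B once.
citations = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "A")]
ranking = rank_by_citations(citations)
```

A brand new paper has no incoming citations yet and so doesn’t show up in the ranking at all, no matter how foundational it may turn out to be.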

Although citation indices are a single, easy to calculate metric, they don’t seem to be a foolproof method to identify foundational research, and a paper’s impact takes a number of years to become evident in its citation count. The researchers at MIT decided to see if using an AI model to identify high impact research would work better.

How DELPHI works

Apparently, DELPHI uses article metadata, such as one can find looking at the Nature article behind this research (linked to above), to create a knowledge graph. They then use the knowledge graph and an AI model to predict whether the research will become high impact or not. The threshold they used for their publication was any research DELPHI predicts would be in the top 5% of all research in a domain.
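The top 5% threshold can be sketched as follows, assuming each article already has some numeric predicted impact score. The function name, the paper IDs and the scores are hypothetical, and how DELPHI actually produces and thresholds its scores isn’t something I can confirm without the paper.

```python
import math

def top_fraction(scores, fraction=0.05):
    """Return the ids whose score is in the top `fraction` of all scores.

    scores: dict mapping an article id to a numeric predicted impact score.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:keep])

# 100 hypothetical articles whose score equals their index.
scores = {f"paper_{i}": float(i) for i in range(100)}
high_impact = top_fraction(scores)  # the 5 highest scoring articles
```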

Not having access to their paper (or code, see below), we can’t determine if they used a DNN or some other AI/data analytics approach to come up with their prediction.

The input data (article metadata) came from a website, Lens.org which provides metadata for ~230M research articles and ~130M patent filings. The researchers focused on life sciences as the domain to analyze to predict impact, but presumably their approach would work on any scientific domain.

The research analyzed all scientific articles from 42 life sciences journals (listed in the article’s supplementary information). They used articles written prior to 2017 as their training set and then used their model to predict the impact of articles published since 2018.

In the Nature article’s supplementary information they provide a table (Table 2) which lists some of the life sciences articles since 2018 that DELPHI predicts will have high (top 5%) impact. There are ~50 articles listed in the table, and they supply the (knowledge) full-graph (citation) count as well as citation counts for the articles.

2nd of 3 pages for table 2 in Nature article’s supplementary information

The Nature article’s home page also lists links to the researchers’ code and data in one of the researchers’ GitHub repos. When I attempted to download the trained model and sample dataset, it generated a “links had expired” error message from Dropbox. The repo readme file suggested reaching out to the researcher if this happened. We did that, but had not received any response prior to this post’s publication.

In any case, in the GitHub repository, there are a sample Jupyter notebook and a Dockerfile used to create a container to run the notebook in. The data they supplied is supposedly a sample of 206 articles (metadata), and the notebook uses their model to predict the impact level for those sample articles.

I would have liked to see more information on their model layer structure, hyper-parameters and other model information as well as prediction reliability statistics. But perhaps this is outlined in the Nature article or provided in the model download.

But the approach seems sound enough and even if the researchers didn’t use a DNN, it would easily lend itself to a DNN prediction, assuming you could:

1. Algorithmically create the knowledge graph from article metadata,

2. Digitize and quantify the metadata knowledge graph for all the articles, and

3. Obtain an independent assessment of impact levels for all research in the training set.

~~~~

Now if we could just do this for blog posts and podcasts it might be even more useful (for us).

Comments?

Towards a better AGI – part 3(ish)

Posted on May 10, 2021 (updated August 9, 2022) by Ray in Artificial Intelligence, Cognitive computing, Executive leadership, R&D measures, Scenario planning, Strategic Inflection Points, System effectiveness, Visionary leadershp

Read an article this past week in Nature about the need for Cooperative AI (Cooperative AI: machines must learn to find common ground) which supplies the best view I’ve seen as to a direction research needs to go to develop a more beneficial and benign AI-AGI.

Not sure why, but this past month or so, I’ve been on an AGI fueled frenzy (at least here). I didn’t realize this was going to be a multi-part journey; otherwise, I would have labeled them AGI part-1 & -2 (please see: Existential event risks [part-0], NVIDIA Triton GMI, a step to far [part-1] and The Myth of AGI [part-2] to learn more).

The Nature article puts into perspective what we all want from future AI (or AGI). That is,

AI-AI cooperation: AI systems that cooperate with one another while at the same time understanding that not all activities are zero sum competitions (like chess, go, Atari games); rather, most activities within the human sphere are cooperative activities where one agent has a set of goals and a different agent has another set of goals, some of which overlap while others are in conflict. Sports like soccer and lacrosse come to mind. But there are also card games and board games (Risk & Diplomacy) that use cooperating parties with diverse goals to achieve common ends.
AI-Human cooperation: AI systems that cooperate with humans to achieve common goals. Here too, most humans have their own sets of goals, some of which may be in conflict with the AI system’s goals. However, all humans have a shared set of goals; preservation of life comes to mind. It’s in this arena where the challenges are most acute for AI systems. Divining the underlying goals and motivations of humans, and of the AI system itself, is not simple. And of course giving priority to the “right” goals when they compete or are in conflict will be an increasingly difficult task to accomplish, given today’s human diversity.
Human-Human cooperation: Here it gets pretty interesting, but the paper seems to say that any future AI system should be designed to enhance human-human interaction, not deter or interfere with it. One can see the challenge of disinformation today and how wonderful it would be to have some AI agent that could filter all this and present a proper picture of our world. But, humans have different goals and trying to figure out what they are and which are common and thereby something to be enhanced will be an ongoing challenge.

The problem with today’s AI research is that it’s all about improving specific activities (image recognition, language understanding, recommendation engines, etc.), but these are all point solutions and few (if any) are focused on cooperation.

Tit for tat wins the award

To that end, the authors of the paper call for a new direction, one that attempts to imbue AI systems with social intelligence and cooperative intelligence so they work well in the broader, human dominated world that lies ahead.

In the Nature article they mention a 1984 book by Robert Axelrod, The Evolution of Cooperation, perhaps the last great research on cooperation that was ever produced.

The book describes a world full of simulated prisoner’s dilemma actors that interacted, one with another, at random.

The experimenters programmed some agents to always do the proper thing for their current partner, some to always do the wrong thing to their partner, others to do right once then wrong from that point forward, etc. The experimenters tried every sort of cooperation policy they could think of.

Each agent would get some number of points for an interaction. For example, if both did the right thing they would each get 3 points; if only one did wrong, the sucker would get 0 and the bad actor would get 5; if both did wrong, each got 1 point.

The agents that had the best score during a run (of 1000s of random pairings/interactions) would multiply for the next run, and the agents that did worse would disappear over time from the population of agents in the simulated worlds.

The optimal strategy that emerged from these experiments was

Do the right thing once with every new partner, and
From that point forward, tit for tat (if the other party did right the last time, then you do the right thing the next time you interact with them; if they did wrong the last time, then you do wrong the next time you interact with them).

It was mind boggling at the time to realize that such a simple strategy could be so effective/sustainable in simulation and perhaps in the real world. It turns out that in a (simulated) world of bad agents, there would be this group of Tit for Tat agents that would build up, defend itself and expand over time to succeed.
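For the curious, here is a minimal Python sketch of tit for tat playing a repeated prisoner’s dilemma. The payoff values (3, 5, 0, 1) are the ones commonly quoted for Axelrod’s tournaments and are an assumption here, as are all the function names. It only plays one pair of agents against each other; it does not model the population growing and shrinking between runs.

```python
# Payoffs as (my_points, their_points); True means "do the right thing".
# Values are the commonly quoted Axelrod tournament payoffs (assumed).
PAYOFF = {
    (True, True): (3, 3),    # both do right
    (True, False): (0, 5),   # I am the sucker
    (False, True): (5, 0),   # I am the bad actor
    (False, False): (1, 1),  # both do wrong
}

def tit_for_tat(their_history):
    """Do right on first contact, then copy the partner's last move."""
    return True if not their_history else their_history[-1]

def always_wrong(their_history):
    """Always do the wrong thing, whatever the partner did."""
    return False

def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds and return (score_a, score_b)."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each agent only sees what its partner did in earlier rounds.
        move_a = strategy_a(moves_b)
        move_b = strategy_b(moves_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        moves_a.append(move_a)
        moves_b.append(move_b)
    return score_a, score_b

friendly = play(tit_for_tat, tit_for_tat)   # cooperation every round
hostile = play(tit_for_tat, always_wrong)   # suckered once, then retaliates
```

Two tit for tat agents earn far more together than either agent does in the hostile pairing, which gives a feel for how a cluster of tit for tat agents can out-score a population of bad actors over many runs.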

That was the state of the art in cooperation research back then (1984). I’ve not seen anything similar to this since, nor anything that discusses how to implement algorithms in support of social intelligence.

~~~~

The authors of the Nature article believe it’s once again time to start researching cooperation techniques and social intelligence, so we can instill proper cooperation and social intelligence technology into future AI (AGI) systems.

Perhaps if we can do this, we may create a better AI (or AGI) so that both it and we can live better in our world, galaxy and universe.

Comments?

cOAlition S requires open access to funded research

Posted on January 26, 2021 by Ray in Business economics, data access, Information economy, R&D measures, Strategic Inflection Points, Visionary leadershp

I read a Science article this last week (A new mandate highlights costs and benefits of making all scientific articles free) about a group of funding organizations, called cOAlition S, that have come together to mandate open access to all peer-reviewed research they fund, under what they call Plan S. The list of organizations in cOAlition S is impressive, including national R&D funding agencies from the UK, Ireland, Norway, and a number of other countries, and R&D funders including WHO, the Wellcome Trust, the Bill & Melinda Gates Foundation and more, and the group is also being funded by the EU. Plan S takes effect this year.

Essentially, all research funded by these organizations must be immediately published in an open access forum or open access journal, or be freely available in an open access section of a publisher’s website, which means it can be read for free by anyone worldwide with access to the web. Authors and institutions will retain copyright for the work and the work will be published under an open access license such as the CC BY (Creative Commons Attribution) license.

Why open access is important

At this blog, frequently we find ourselves writing about research which is only available on a paid subscription or on a pay per article basis. However, sometimes, if we search long enough, we find a duplicate of the article published in pre-print form in some preprint server or open access journal.

We have written about open access journals before (see our New Science combats Coronavirus post). Much of what we do on this blog would not be possible without open access journals and repositories like PLoS, bioRxiv, and PubMed.

Open access mandates are trending

Open access mandates have been around for a while now. And even the US Gov’t got into the act, mandating that all research funded by the NIH be open access by 2008, with the Depts of Agriculture and Energy following later (see Wikipedia’s Open access mandates article).

In addition, given the pandemic emergency, many research publishers like Nature and Elsevier made any and all information about the Coronavirus free access on their websites.

Impacts and R&D research publishing business model

Although research is funded by public organizations such as charities and government agencies, prior to open access mandates, most research was published in peer-reviewed journals which charged a fee for access. For many research organizations, those fees were a cost of doing research. If you were an independent researcher or in an institution that couldn’t afford these fees, attempting to do cutting edge research was impossible without this access.

Yes, in some cases, those journal repositories waived these fees for deserving institutions and organizations, but this wasn’t the case for individual researchers. Or if you were truly diligent, you could request a copy of a paper from an author and wait.

Of course, journal publishers have real expenses they need to cover, as well as make a reasonable profit. But due to business consolidation, there were fewer independent journals around and, as a result, publishers charged bundled license fees for vast swathes of research articles. Such a wide bundle may or may not be of interest to an individual or an institution. On top of that, with consolidation, profits were becoming a more significant consideration.

So open access mandates often included funding to cover the fees publishers charge to supply open access. Such fees varied widely. So open access mandates also began to require that these fees be published, along with a description of how the prices were calculated. By doing so, the funders hoped to make such costs more transparent.

Impacts on authors of research articles

Somewhere there’s an aphorism for researchers that says “publish or perish”, which means you must publish research in order to become a recognized expert in your field. Recognition is often the main driver behind better academic employment and more research funding.

However, it’s not just about the volume of published papers; the quality of research also matters. And the more highly regarded publishing outlets have an advantage here, in that they are de facto gatekeepers to what’s published in their journals. As such, where you publish can often lend credibility to any research.

Another thing has changed over the last few decades: judging the quality of research has become more quantitative. Nowadays, research quality is also judged by the number of citations it receives. The more popular a publisher is, the more readers it has, which increases the possibility for citations.

Thus, most researchers try to publish their best work in highly regarded journals. And of course, these journals have a high cost to provide open access.

Successful research institutions can afford to pay these prices but those further down the totem pole cannot.

Most mandates come with additional funding to support paying the cost to supply open access. But they also require publishers to publish and justify these costs, in the belief that doing so will lend some transparency to them.

So the researcher is caught in the middle. Funding organizations want open access to research they fund. And publishers want to be paid a profit for that access.

History of research publication

Nature magazine first started publishing research in 1869, Science magazine first published in 1880, and the Royal Society first published research in 1665. So publishing research has been going on for over 350 years and, at least as a for profit business model, since the mid-1800s.

Research prior to being published in journals was only available in books. And more than likely, the author of the research had to pay to have a book published and the publisher made money only when those books were sold. And prior to that, scientific research was mostly only available in a course of study, also mostly paid for by the student.

So science has always had a cost to access. What open access mandates are doing is moving this cost to something added to the funding of research.

Now if only open access could solve the reproducibility crisis in science, we could have us a real scientific revolution.

Comments?

Photo Credits:

From the cOAlition S website
From the PLoS website
From the BioRxiv website
From wikipedia article on Sir Isaac Newton

The beginning of the end of cancer

Posted on January 13, 2021 by Ray in Data analysis/Big Data, R&D measures, Strategic Inflection Points, Visionary leadershp

Read an article today about new research that applied big data analytics to multiple cancer types to identify key control mechanisms that allow cancer to survive in the body and multiply. The article Big data analysis find cancer’s key vulnerabilities discusses the researchers’ discovery of 24 “master regulators” that are present in a number of different cancers. The original research article is in Cell (behind paywall) but I managed to find a preprint on bioRxiv.

From a (software) coding perspective, it’s almost like a majority of cancers are re-using the same modules to perform functions that are needed by the cancer cells. Not all cancers exhibit all master regulator blocks but all the cancers that they have examined have some of them.

The researchers examined the regulatory/signaling networks of proteins in 112 tumor subtypes. They identified 407 master regulatory proteins, and further analysis showed that these proteins were associated with 24 master regulatory architectures (oncotectures). A decent layman’s description of a cancer oncotecture can be found in an old (2016) Economist article, Cancer’s master criminals…

Master regulatory proteins

According to the Economist article master regulatory proteins are proteins that regulate processes in a cancer cell that cause other proteins to be made, which cause other proteins to be made, etc. which affect the way a cancer cell lives and propagates inside a body.

Biologists call these sorts of proteins transcription factors. They control the copying of DNA information into mRNA, which is then taken to protein factories to create proteins from that blueprint.

The research team believes that if these 24 master regulatory (MR) blocks could be disabled somehow, it would disrupt the cancer cell and ultimately eliminate that cancer from a body.

It’s almost like a DevOps script that automates the deployment of software inside the cloud. The fact that they have identified 24 master regulatory architectures (MR blocks, sequences of proteins that occur) that apply to a wide set of cancer tumor subtypes implies that these could be needed to regulate the functionality of these cancers. If drugs could be devised to interrupt, change or deactivate these master regulatory blocks, it’s quite possible that these cancers would be eliminated.
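To make the cascade idea concrete, here is a toy sketch in Python. It assumes a made-up regulatory network (the protein names `MR_A`, `P1`, etc. are hypothetical and not taken from the paper) and simply walks the "regulates" links to show how much of a network sits downstream of one master regulator.

```python
from collections import deque

# Hypothetical toy network: each key switches on the proteins in its list.
# All names are invented for illustration; none come from the research.
REGULATES = {
    "MR_A": ["P1", "P2"],
    "P1": ["P3"],
    "P2": ["P3", "P4"],
    "P3": [],
    "P4": ["P5"],
    "P5": [],
}

def downstream(network, start):
    """Return every protein reachable from `start` by following regulation links."""
    seen = set()
    queue = deque(network.get(start, []))
    while queue:
        protein = queue.popleft()
        if protein in seen:
            continue  # already visited via another path
        seen.add(protein)
        queue.extend(network.get(protein, []))
    return seen

# Everything that would lose its upstream signal if MR_A were disabled.
print(sorted(downstream(REGULATES, "MR_A")))  # ['P1', 'P2', 'P3', 'P4', 'P5']
```

In this toy network, disabling the one regulator at the top touches every other protein, which is the intuition behind targeting MR blocks rather than individual downstream proteins.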

Identifying MR Blocks using (Bio/Life Sciences) Big Data

It all starts with VIPER analysis (GitHub repo), which measures a specific protein’s transcriptional activity level. In this fashion they were able to analyze the proteomes (the total complement of all proteins active in a cell) of the 112 tumor subtypes and, using cluster analysis, whittle these down to the proteins that were especially relevant for cancer cell transcription activity.

They then used DIGGIT analysis (GitHub repo of R implementation) to identify the MR proteins and identify cellular mutations that led to them. The types of mutations can be copy number, single point or gene fusion. DIGGIT analysis can help identify which of the mutations are responsible for the protein being analyzed. The DIGGIT process is a multi-step, analytical approach to identifying candidate MR proteins.

Then, using the tumor checkpoint hypothesis and Bayesian analysis/integration, they further ranked the MR candidate proteins. Tumor checkpoints are state transitions in the life of a cancer cell where the cell assesses its environment and then determines what actions to take next.

The tumor checkpoint hypothesis says that during the life cycle of a cancer cell it goes through various state transitions. The researchers have shown that these state transitions are managed by the MR blocks they have identified.

In the final step of their analysis, they used the tumor checkpoint hypothesis along with saturation and modularity analysis to identify the top MR proteins and the MR blocks active in the 112 tumor subtypes.

At the end of their analysis, they had identified 24 MR blocks which, solely or in some combination, are present in each of the 112 tumor subtypes. If these MR blocks could be attacked by specific drugs, then each of these 112 tumor subtypes could essentially be eliminated from a body, or in other words, that cancer could be cured.

Photo Credit(s):

Figure 6 G from A Modular Master Regulator Landscape Determines the Impact of Genetic Alterations on the Transcriptional Identity of Cancer Cells article in bioRxiv
Wikipedia by Kevin13
Figure 1 A, B, & C from A Modular Master Regulator Landscape Determines the Impact of Genetic Alterations on the Transcriptional Identity of Cancer Cells article in bioRxiv

Where should IoT data be processed – part 2

Posted on September 28, 2020 by Ray in Crowdsourcing, Data efficiency, data logistics, Distributed computing, IoT, Networking, R&D measures, Strategic Inflection Points

I wrote a post a while back on Where IOT data should be processed – part 1. We will get back to that post in a moment, but recently I read an article (How big data forced the hunt for ET intelligence to evolve) that mentioned that, after 20 years, SETI@home was being shut down.

SETI@home was a crowdsourced computational network that took snippets of radio spectrum and sent them to 1000s of home computers to be analyzed during idle computer time. Once processed, the analysis was sent back to SETI@home. It was one of the first projects to use a crowdsourced approach to perform data processing. The data was collected at a radio telescope, sent to SETI@home and distributed from there.

6 Factors for IOT data processing

In my post I talked about 6 factors that should help determine where data is processed. Those 6 factors were:

Data size which is a measure of the amount (GB, TB or PBs) of data that is being generated at an IOT node
Data pipe availability, which is all about the networking bandwidth that’s available at the IOT node. If we are talking some sort of low-bandwidth networking access then it probably makes sense to process the data more locally and send only results of processing up the stack.
Processing criticality, which indicates how important the processing of the data is. If the processing could save a life then maybe it should be done as close as possible to where the data is generated. If the data processing is less critical it could perhaps be done at other nodes in an IOT network.
Processing time and infrastructure cost, which is all about what sort of computational resources are required to perform the processing and how much it would cost. If processing of the data requires multiple passes, multi-core CPUs or GPUs, moving data off the IoT node and onto a more comprehensive server to process it could make sense.
Compliance, governance and archive requirements, which discussed the potential need for all data to be available for regulatory audits and as such may need to be available at a central location anyway so why not perform processing there.
Data information funnel, which talked about the fact that an IoT network should be configured in layers and that each layer in the stack should probably be responsible for some portion of the data processing needed by the overall system, if nothing more than compressing the information before it is sent elsewhere.

Now that I review the list, the last factor, Data information funnel, really should be a function of the other factors rather than a separate factor.

In that blog post I promised to follow it up with some examples of the logic applied to real world problems. SETI is the first one I’ve seen in the literature.

SETI’s IoT processing problem

Closeup front view of one antenna of the Allen Telescope Array, a radio telescope for combined radio astronomy and SETI (Search for Extraterrestrial Intelligence) research being built by the University of California at Berkeley, outside San Francisco. The first phase, consisting of 42 six-meter dish antennas like the one shown here, was completed in 2007. Eventually it will have 350 antennas. This type of antenna is called an offset Gregorian design. The incoming radio waves are reflected by the large parabolic dish onto a secondary concave parabolic reflector in front of the dish, and then into a feed horn. A metal shroud can be seen along the bottom of the secondary reflector which shields the antenna from ground noise. It covers the frequency range from 0.5 to 11.2 GHz.

The SETI researchers found that “The telescopes are now capable of producing so much data that it’s not possible to get that volume of data out to volunteers,” and “The discovery space is in these massive, massive data streams. And it’s just not efficient to distribute many terabits per second out to volunteers all over the world. It’s more efficient for that data processing to happen at the actual observatory.”

So they moved the data processing for the SETI IoT network from being distributed out to home computers throughout the world to being done at the (telescope) source where the data was originally generated.

This decision seems to rely on a couple of the factors above, namely the data pipe availability and data size factors. They had to move processing because no pipes existed to send terabits of data to 1000s of home computers. And finally, processing time and infrastructure cost have come down so much that it was just easier to do the processing onsite.
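A back-of-envelope calculation shows why the pipes were the problem. The numbers below are assumptions picked for illustration only (a 1 Tb/s data stream, a 10 Gb/s observatory uplink and 100 Mb/s home connections); the article only says "many terabits per second".

```python
import math

def backlog_factor(stream_tbps: float, uplink_gbps: float) -> float:
    """How many times faster data is produced than the uplink can ship it out."""
    stream_gbps = stream_tbps * 1000  # 1 Tb/s = 1000 Gb/s
    return stream_gbps / uplink_gbps

def volunteers_needed(stream_tbps: float, home_mbps: float) -> int:
    """Home connections needed, each fully saturated, just to receive the stream."""
    stream_mbps = stream_tbps * 1_000_000  # 1 Tb/s = 1,000,000 Mb/s
    return math.ceil(stream_mbps / home_mbps)

# Assumed figures, not measurements from SETI.
print(backlog_factor(1, 10))      # 100.0
print(volunteers_needed(1, 100))  # 10000
```

Under these assumed figures the telescope would generate data 100 times faster than it could be sent offsite, and even with enough uplink it would take 10,000 home connections running flat out around the clock just to keep up.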

It doesn’t seem like processing criticality or compliance-governance-archive had any bearing on the decision.

So there’s the first example that seems to fit well into our data processing framework.

~~~~

We ought to be able to come up with a formula that uses all these factors and comes up with a yes or no as to whether to process the data on the node or not.
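One way such a formula might look, as a rough sketch only: score each factor from 0 to 1, weight them, and compare the total to a threshold. The weights, the threshold and the example scores below are all hypothetical guesses, not values derived from any data, and the funnel factor is left out since it looks like a function of the others.

```python
from dataclasses import dataclass

@dataclass
class NodeProfile:
    """Each field is a hand-assigned score between 0 and 1 (hypothetical scale)."""
    data_size: float           # 0 = tiny, 1 = huge
    pipe_availability: float   # 0 = almost no bandwidth, 1 = plenty
    criticality: float         # 0 = can wait, 1 = life or death
    compute_cost: float        # 0 = cheap to run on the node, 1 = needs big servers
    compliance_central: float  # 0 = no central archive needed, 1 = all data must be central

def node_score(p: NodeProfile) -> float:
    """Weighted score; higher favors processing on the node. Weights are guesses."""
    return (
        0.3 * p.data_size
        + 0.3 * (1 - p.pipe_availability)
        + 0.2 * p.criticality
        + 0.1 * (1 - p.compute_cost)
        + 0.1 * (1 - p.compliance_central)
    )

def process_on_node(p: NodeProfile, threshold: float = 0.5) -> bool:
    """Yes/no answer: process at the node when the score reaches the threshold."""
    return node_score(p) >= threshold

# Made-up scores loosely modeled on the SETI case and on a small remote sensor.
seti = NodeProfile(1.0, 0.1, 0.1, 0.2, 0.1)
sensor = NodeProfile(0.1, 0.9, 0.1, 0.8, 0.9)
print(process_on_node(seti))    # True
print(process_on_node(sensor))  # False
```

With these made-up scores the SETI telescope comes out as "process on the node" and the small sensor as "send it up the stack". A real formula would need weights calibrated against actual deployments.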

Photo Credit(s)

NASA/ESA/The Hubble Heritage Team (STScI/AURA)
NASA/ESA/The Hubble Heritage Team (STScI/AURA)
Colby Gutierrez-Kraybill