Thursday, February 27, 2014

When (specialized parts of) two heads are better than one...

A recent review highlighted the small army of databases that has sprung up to help keep track of what we're learning about cells. Many of these databases focus on a particular feature of cell signaling (like protein-protein interactions or post-translational modifications), with a few databases combining information across multiple features to help build a more complete picture. A question that remains is how these collections of information can be used to help us achieve practical goals - identifying drug targets or predicting the physiological effects of mutations.

Computational modeling could have a role to play by turning descriptions of interactions into quantitative predictions. As databases tend to be managed by groups of people, one might expect that large-scale modeling projects could also benefit from a community-driven approach. However, modeling tends to be carried out by individuals or small groups. Are there ways to turn modeling into a community activity?

A first step is probably to put models into a format that is easy to navigate and that encourages interactions among people. One such format is a wiki, and there are actually a few examples of wikis being used to simultaneously annotate models and to consolidate information about a signaling pathway - a little like an interactive literature review that you can simulate on a computer. I think this is a cool concept, although it seems like these wikis tend to stop being updated soon after their accompanying paper is published. There have also been some efforts to establish databases for models, which would in principle make it easier for people to build on past work. But in practice, so far, it seems that these databases are not very active either.

The issues involved in community-based modeling are also something I thought about when I read (the verbose yet interesting) "Reinventing Discovery" by Michael Nielsen, a book that advocates for "open science": a culture in which data and ideas are shared freely, with the goal of facilitating large-scale collaborations among people with diverse backgrounds. The underlying motivation is that progress can be accelerated if problems are broken down into modular, specialized tasks that can be tackled by experts in a particular area. I can see how such an approach would be beneficial in modeling and understanding cell signaling - a topic that can encompass everything from ligand-receptor interactions to transcriptional regulation to trafficking, each of which is a complicated field in its own right. So, how can experts in these fields be encouraged to pool their knowledge?

Nielsen's book has many examples of collaborative strategies in science succeeding and failing. As it turns out, creating wikis just for the sake of it is not always a good idea, because scientists often have little incentive to contribute. They would (understandably) prefer to be writing their own papers rather than spending time contributing to nebulous community goals. It seems that in most examples where "collective intelligence" has succeeded, participants have had specific rewards in mind. There's Foldit, the online game where players compete at predicting protein structures. And perhaps the most famous example is Kasparov vs. The World. (It's noteworthy that in both of these examples, many participants are not trained professionals in the activity they are participating in - structural biology and chess, respectively.)

I wonder what the field of cell signaling can learn from these examples. Does there need to be a better incentive for people to help with wikis/databases? One might imagine a database where an experimentalist can contribute a piece of information about a protein-protein interaction, which would automatically gain a citation any time it was used in a model. Or, can some part of the modeling process be turned into a game or other activity that many people would want to participate in? It seems like there are a lot of possibly risky, but also possibly rewarding, paths that could be tried.

Saturday, February 22, 2014

Logical modeling vs. rule-based modeling

Cell signaling systems have been modeled using both logical and rule-based approaches. What's the difference? The two are similar in that both types of models involve rules, but the rules are usually rather different in character.

In a typical logical model, rules define state transitions of biomolecules, including conditions on these transitions; they have an "if-then" flavor. The rules operate on variables representing states of whole biomolecules, and they define when and how such state variables change their values. Biomolecules in logical models are often characterized by state variables that take one of two values, e.g., 0 or 1, representing "on" and "off" states. More than two states can be considered, but there is a limit to what's tractable: with n state variables that each take k values, there are k^n possible global states, and as more states are considered, there are more and more transitions between them, each of which is usually specified explicitly in a logical model. The behavior of a logical model can also depend on the algorithmic protocol used for changing states in a simulation (e.g., synchronous vs. asynchronous updating), which seems undesirable.

In a rule-based model, the amount of an activated protein can be continuous or discrete, from 0 copies to all copies of the protein, because a rule-based model is based on the principles of chemical kinetics. The state variables implicitly defined by rules capture numbers of biomolecules in particular states and/or complexes. Rules are associated with rate laws, which govern the rates or probabilities of transitions of biomolecular site states, not the state transitions of whole molecules. With this physicochemical foundation, it is relatively easy to capture certain phenomena found in cell signaling systems, such as competition, feedback, and crosstalk, which are more difficult to capture in a logical model. At least, it seems that way to me.

With model-specification languages such as BNGL (http://bionetgen.org), a single set of rules can be used for different tasks: stochastic or deterministic simulation, via a direct or indirect method. Is it possible to modify BNGL to enable logical modeling? Although typical logical models are different from typical rule-based models, the rules used in the two types of models do not seem to be fundamentally different, so my answer is a tentative "yes." What do you think?
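To make the contrast concrete, here is a minimal BNGL-style sketch of the kind of rule-based model described above. The ligand L, receptor R, site names, and parameter values are all hypothetical, invented purely for illustration; the point is that each rule specifies a transition of site states (free vs. bound, unphosphorylated vs. phosphorylated) with an associated rate law, and the same set of rules can then be handed to either a deterministic (ODE) or stochastic (SSA) simulator.

    begin model
    begin parameters
      kp    1e-3   # association rate constant (hypothetical value)
      km    0.1    # dissociation rate constant (hypothetical value)
      kcat  1.0    # phosphorylation rate constant (hypothetical value)
    end parameters
    begin molecule types
      L(r)         # ligand with a receptor-binding site
      R(l,y~0~P)   # receptor with a ligand-binding site and a site y,
                   # which is unphosphorylated (0) or phosphorylated (P)
    end molecule types
    begin seed species
      L(r)       1000
      R(l,y~0)   5000
    end seed species
    begin observables
      Molecules RP R(y~P)   # number of phosphorylated receptors
    end observables
    begin reaction rules
      # Ligand binds receptor, regardless of the receptor's phosphorylation state
      L(r) + R(l) <-> L(r!1).R(l!1)   kp, km
      # A ligand-bound receptor is phosphorylated at site y
      R(l!+,y~0) -> R(l!+,y~P)        kcat
    end reaction rules
    end model

    # The same rules can be simulated deterministically or stochastically
    generate_network({overwrite=>1})
    simulate({method=>"ode", t_end=>100, n_steps=>100})
    resetConcentrations()
    simulate({method=>"ssa", suffix=>"ssa", t_end=>100, n_steps=>100})

In a logical model, by contrast, the second rule above might read "if R is ligand-bound and inactive, then R becomes active": it operates on the whole-molecule state of R, has no rate law attached, and says nothing about how many copies of R make the transition per unit time.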

Tuesday, February 11, 2014

Dismantling the Rube-Goldberg machine

What do this puppy food commercial, an indie music video, and systems biology have in common?

They've all used Rube-Goldberg machines to great effect - devices that execute an elaborate series of steps to accomplish a goal that could have been reached through a (much) simpler process. As the ultimate elevation of means over ends, these machines have become a celebrated expression of ingenuity and humor. However, the presence of such machines in systems biology is perhaps not as obvious, intentional, or entertaining.

So what are the Rube-Goldberg machines of systems biology? In a small but noticeable fraction of studies, complex models are used to reach conclusions that could be obtained just by looking at a diagram or by giving some thought to the question at hand - assuming that a question is at hand. It seems as though these studies primarily use models to produce plots and equations that reinforce, or embellish, intuitive explanations. However, the true usefulness of models comes into play when we leave the territory of intuition and begin to wonder about factors that can't be resolved by just thinking.

So when and why do we start thinking like Rube-Goldberg engineers, and what impact does it have on the field? A few educated guesses:
  • Some models are built without a question in mind. Their creators then search for a question to address, and end up with one that the model's content isn't well-suited to. 
  • We're all specialists in something, and we don't always know about all the tools and capabilities that others have developed. As a result, we sometimes try to solve a problem by reinventing the wheel, or by applying a tool that isn't a good fit for the problem, which can lead to all kinds of complications. 
  • To some audiences, just the concept of doing simulations seems impressive. As a result, modelers can be drawn into just putting technical skills on display and establishing a mystique around what they do, as opposed to applying their abilities to interesting questions.  
  • Obvious predictions may be easier to validate experimentally. 
I don't know if these practices have had a wholly negative impact on modeling efforts in biology - they may even have helped in some respects. But it would not be a bad idea to focus on challenging questions for which simulations are actually needed, and to try to get the most out of the models that we've taken the time and effort to build.

Friday, February 7, 2014

Do modelers have low self-esteem?

When was the last time an experimental biologist had a manuscript rejected because the work in question didn't include modeling? As a modeler working in biology, I tend to be hesitant about trying to report work that doesn't include new data (from a collaborator), because it doesn't usually go well. Modeling without experimentation tends to be held in low regard, especially among modelers, which is a tragic irony. If physicists of the early 20th century had had the same attitude as many of today's modelers and experimental biologists, "Zur Elektrodynamik bewegter Körper" (Einstein's 1905 paper on special relativity) would not have been published without its author first doing experiments to confirm his ideas, or finding an experimental collaborator to generate confirmatory data.

I don't think that modelers should take a favorable view of every modeling study they come across, but I wonder if we need to be more supportive of each other and allow more room for independence from collaborations with experimentalists. If a modeling study is based on reasonable assumptions, is performed with care, and produces at least one non-obvious testable prediction, why should it not be reported immediately? It seems that some of us are concerned that such reports will be ignored, or that they are too untrustworthy, given all the complexities and ambiguities. It's true that models need to be tested, but it seems unlikely that someone able to build and analyze a model will also be the best person to test it, or will happen to have a circle of friends that includes that special person.

Indeed, I think the requirement to publish with data has led some modelers to produce predictions that are, let's say, "obvious," because this is the type of prediction that can be confirmed easily. Let's be rigorous, but to a reasonable standard. Let's also be bold. Many experimental results turn out to be misinterpreted, or plain wrong. It's OK for models to be wrong too.

Biological systems are complicated, and we need models to guide our study of them. Most of the work being done in biology today is performed without models. Until experimentalists start chiding each other for failing to leverage the powerful reasoning aids that models are, it makes little sense for modelers to criticize each other for work that doesn't include the generation of new data.