knowit-doit | Where the Genes Flow

Cleveland, OH

The Challenge | Where the Genes Flow

Map and compare population genetics of a species with landscape features, climate conditions, and human activities in a region to identify potential barriers or facilitators to gene migration and local adaptation.

knowit-doit: This thing called life!

Finding correlation in multidimensional data. Each step of the procedure is broken down into concise, easily understood units (called Stringules) so that anyone can follow it, regardless of background.


Background: The Internet is full of information, yet that information often turns out to be limited. Whenever I try to find answers on a moderately esoteric topic, I hit a roadblock pretty fast. For instance, try to find a way to calculate correlations among more than two data sets at the same time, or a specific script that would help you find patterns in them.

Most of the information you gather after a multi-hour search will probably only skim the surface of these topics. With my solution, I envisage making information about esoteric topics available in a concise, readily understandable form. To that end, a couple of months ago I set out to build a website. The site presents complex or difficult topics as little packets of knowledge called Stringules. Since each person prefers to learn in a different way, I offer multiple ways of presenting the information. My algorithm (written in Perl) asks users a few questions and, based on their responses, picks the best way to present it.

In my solution for the NASA Space Apps Challenge 2017, I used a similar process to work on the "where the genes flow", "dictionary of earth" and "migratory birds" challenges. In the video attached above, I describe a proof of concept that 1) explains a moderately esoteric topic in statistics, Principal Component Analysis (PCA), and 2) uses it to find correlations among data obtained from NASA resources.

I deliberately used a two-dimensional data set because it makes each individual step of the process easier to show. In the long run, I hope to 1) continue creating Stringules for esoteric topics (which is relevant to the "Dictionary of Earth" challenge) and 2) use NASA data to calculate multidimensional correlations and help find solutions to some of the planet's major problems.
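To make the individual steps concrete, here is a minimal sketch of PCA on a two-dimensional data set, written in Python with only the standard library. It is not the code used in the video; the function name `pca_2d`, the returned dictionary keys, and the sample numbers are assumptions made purely for illustration, and the sample values are not NASA data. The sketch also assumes both variables actually vary (non-zero variance).

```python
import math

def pca_2d(xs, ys):
    """Illustrative PCA for two variables (hypothetical helper)."""
    n = len(xs)

    # Step 1: center each variable on its mean.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # Step 2: sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(a * a for a in dx) / (n - 1)
    syy = sum(b * b for b in dy) / (n - 1)
    sxy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)

    # Step 3: eigenvalues of the symmetric 2x2 matrix (closed form).
    half_trace = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    lam1 = half_trace + spread   # variance along the first component
    lam2 = half_trace - spread   # variance along the second component

    # Step 4: direction of the first principal axis, and its perpendicular.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))

    # Step 5: project the centered points onto the two components.
    scores = [(a * pc1[0] + b * pc1[1], a * pc2[0] + b * pc2[1])
              for a, b in zip(dx, dy)]

    return {
        "components": (pc1, pc2),
        "variances": (lam1, lam2),
        "explained": (lam1 / (lam1 + lam2), lam2 / (lam1 + lam2)),
        "correlation": sxy / math.sqrt(sxx * syy),  # Pearson r, for comparison
        "scores": scores,
    }

# Made-up example values, for illustration only.
temperature = [12.1, 14.3, 15.8, 18.2, 20.5, 22.9]
abundance = [30.0, 34.5, 36.1, 41.0, 47.2, 50.3]
result = pca_2d(temperature, abundance)
print(result["explained"], result["correlation"])
```

When the first component explains nearly all of the variance, the two variables are strongly correlated, which is the link between PCA and correlation that the two-dimensional case makes easy to see. With more than two variables the same steps apply, but the eigenvalues no longer have a simple closed form and a numerical library would normally be used.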

One difficulty I faced was that I could not attend the Space Apps Challenge in person. I am a biophysics research scientist in Cleveland, USA, currently on a US visa, and my Space Apps Challenge location (NASA Glenn Center) only admitted US citizens to the facility. I did not have enough time to travel to another city to participate. As a result, I was short on time, could not build a team, and was only able to contribute virtually.

PS: For an efficient and interactive user experience, I am also in the process of converting knowit-doit into an app.

For a video with narration, please see :

My website with Stringules is at :


SpaceApps is a NASA incubator innovation program.