If you've ever wanted to "run a study" on some aspect of your personal life, this site is for you. It lets you "science the heck" out of your pet hypotheses by building simple predictive models (regressions) around your data. All the number crunching runs in your browser, so your data stays on your device (unless you actively download and share it, which I tried to make easy).
Greetings dear reader, my name is David Colarusso, and I teach law students about data, machine learning, and “AI.” I made this tool to help my students get a feel for narrow AI (machine learning) by building their own. As Richard Feynman observed, “that which I cannot create, I do not understand.” That being said, I don’t expect my students to become statisticians or data scientists. Rather, I want them to learn enough to understand the realm of the possible and to call BS when needed. To that end, this site lets you create your own toy models, intentional simplifications that help folks explore the dynamics of a situation. Still, keep in mind that the tools provided here are that dangerous mix of power and ease of use. If you can't make it to my class, please consult a statistics text before assuming you know what they are telling you. ;)
I haven't embedded analytics or tracking here, and all your data stays in your browser. I can't see it. My hope is this will free you to experiment honestly. But this means I don't know if folks are even using this tool. So, I'd appreciate you sharing what you're up to. You can find me @Colarusso@mastodon.social or use this feedback form.
Want to look at the code or learn more about how to use this site? Check out this GitHub repo. Regardless, you can get started by making a new model below.
Note: This site was optimized for use as a web app. So, you should add it to your home screen if you're on a phone.
Your data lives in your browser's LocalStorage unless you take affirmative steps to download and share it (e.g., downloading a model). The training of all models is done in your browser. No warranty is offered as to the accuracy of predictions made by models you or others authored. This site is intended to help users explore and become familiar with the promise and limitations of simple prediction algorithms. You can learn more and explore the site's code on its GitHub repo. These terms are subject to change at any time.
The site does NOT make use of tracking cookies or the like to measure engagement.
THE SITE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SITE OR THE USE OR OTHER DEALINGS IN THE SITE.
Maybe you'll create a sleep log and figure out when it's safe to have that last coffee, or discover that some feature you didn't think was important really is the best predictor of your mood, but only if you take the time to explore possible features and collect lots of data. Even then, remember: correlation isn't causation, so consider the context. And beware, for if you torture the data long enough, it will confess to anything.
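To see how tortured data "confesses," here is a small Python sketch. It is an illustration only, not code from this site. It assumes 10 days of a made-up mood score and 200 features that are pure random noise (both numbers are arbitrary choices), then reports the strongest correlation it stumbles on by chance.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
N_DAYS = 10             # hypothetical: only ten observations
N_FEATURES = 200        # hypothetical: many candidate features, all noise

# The "answer" and every feature are random; none is truly related to mood.
mood = [rng.gauss(0, 1) for _ in range(N_DAYS)]
noise_features = [
    [rng.gauss(0, 1) for _ in range(N_DAYS)] for _ in range(N_FEATURES)
]

correlations = [pearson_r(feature, mood) for feature in noise_features]
strongest = max(abs(r) for r in correlations)
print(f"Strongest correlation among {N_FEATURES} noise features: {strongest:.2f}")
```

With so few observations and so many candidates, at least one noise feature will almost always look like a respectable predictor. More data and fewer after-the-fact comparisons are the usual defenses.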
Now go ahead, make a model.
Your model should ask a question. Specifically, a question where the answer is some number or, alternatively, "yes or no." First, let's decide what type of question you're trying to answer.
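The two question types usually map to two kinds of regression: a numeric answer suggests linear regression, and a yes/no answer suggests logistic regression. The Python sketch below shows both with a single input. It is a minimal illustration under my own assumptions, not this site's implementation; the function names, the toy data, and the gradient-descent settings are all made up for the example.

```python
import math


def fit_linear(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, labels, learning_rate=0.5, steps=2000):
    """Logistic regression for one feature, fit by plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, labels):
            error = sigmoid(w * x + b) - y  # predicted probability minus truth
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


# Numeric question: "What will the high temperature be tomorrow?"
todays_high = [60, 65, 70, 75]      # made-up observations
tomorrows_high = [62, 66, 72, 76]
slope, intercept = fit_linear(todays_high, tomorrows_high)
predicted_high = slope * 68 + intercept

# Yes/no question: "Will it rain tomorrow?" (1 = yes, 0 = no)
humidity = [0.2, 0.3, 0.4, 0.6, 0.7, 0.9]  # made-up observations
rained = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(humidity, rained)
chance_of_rain = sigmoid(w * 0.9 + b)

print(f"Predicted high: {predicted_high:.1f}")
print(f"Chance of rain at 90% humidity: {chance_of_rain:.2f}")
```

The linear model returns a number directly, while the logistic model returns a probability between 0 and 1 that you would turn into "yes" or "no" by picking a cutoff such as 0.5.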
Now let's go ahead and pose your question (e.g., "Will it rain tomorrow?" or "What will the high temperature be tomorrow?").
If you'd like, you can write a little note that will appear in the code to provide more context.
Your model will use inputs/features to predict the answer to your question. That is, these are the observations upon which it will base its prediction. Choose a short name of one or a few words to signify this variable. It should make a good variable name (e.g., "Today's high temp.").
Provide more information to show up when collecting this data point.
As with the answer to your question, your inputs/features can be of two types: continuous (numbers) or categorical (one option from a list of options). Unlike your answer, however, if you choose categorical you can make the list of options as long as you like.
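A regression needs numbers, so categorical inputs are typically turned into a set of 0/1 columns, one per option (often called one-hot encoding), while continuous inputs pass through as they are. The Python sketch below illustrates the idea. The `schema` layout, the feature names, and `encode_row` are hypothetical and are not taken from this site's code.

```python
def encode_row(row, schema):
    """Turn one observation into a flat list of numbers.

    schema maps each feature name to None (continuous) or to the
    list of allowed options (categorical).
    """
    encoded = []
    for name, options in schema.items():
        value = row[name]
        if options is None:
            # Continuous feature: use the number as-is.
            encoded.append(float(value))
        else:
            # Categorical feature: one 0/1 column per option.
            if value not in options:
                raise ValueError(f"{value!r} is not a listed option for {name!r}")
            encoded.extend(1.0 if value == option else 0.0 for option in options)
    return encoded


# Hypothetical features: one continuous, one categorical.
schema = {
    "high_temp_f": None,
    "sky": ["clear", "cloudy", "rainy"],
}

observation = {"high_temp_f": 72, "sky": "cloudy"}
print(encode_row(observation, schema))
```

Each extra option adds another column, so a long list of options needs more observations before the model can say anything useful about each one. Many tools also drop one of the 0/1 columns when the model has an intercept, since the last column is implied by the others.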
The units of measure for your feature (e.g., degrees Fahrenheit).
The lowest value you expect to collect.
The highest value you expect to collect.
The average value for this measure.
Build a list of all possible options.
Based on n observations