January 9, 2026
These days, I develop with a Q&A AI tool, a systematic, slow-and-steady human-and-AI approach, and/or parallel agent mode. Mixed in there is hand-written code, supplementary research from docs or online, and hopefully some human intelligence.
November 11, 2025
I find that it’s effective to nudge a RAG system to prioritize returning quotes and sources.
June 13, 2025
…which can then be served on-device, for free, as part of your iOS app.
May 16, 2025
You’ve probably wanted to deploy a chatbot lately.
April 24, 2025
While ChatGPT or similar can now search the web, your proprietary content is probably unavailable online, and therefore can’t be used to provide more “contextual” responses.
October 28, 2024
When training AI, engineers begin with some random numbers.
September 3, 2024
BERT is a language model released in late 2018, soon after GPT-1. There are many similarities, and some major differences.
August 6, 2024
Actual machine learning code usually takes up a minority of your project code [1].
August 1, 2024
In classification, you want to minimize your mistakes, or misclassifications. The fewer mistakes, the better your model.
July 26, 2024
For machine learning on tabular data, you probably should use XGBoost.
June 13, 2024
Easily obtain common results and outputs. This should get you 80% of the way.
May 13, 2019
Makefiles are a simple way to organize code execution/compilation.
March 24, 2018
Great design comes from great designers.
December 2, 2017
In machine learning, cross-validation is always the answer. But what are the questions?
November 20, 2017
When the normal equations are computationally expensive, such as with large feature spaces, linear regression may be solved using the gradient descent algorithm.
August 30, 2017
In software engineering, there is a rule of three which can make your programming work more efficient.
May 9, 2017
tmux manages your terminal windows to help you work on multiple projects.
April 17, 2017
Below are some characteristics of a reproducible analysis, in contrast to a non-reproducible analysis.
February 10, 2017
Do you love R’s dplyr?
July 12, 2016
If you are looking for it, here is one framework to distinguish statistical modeling from machine learning. It hinges on whether or not you are interested in the interpretability of your model.
July 6, 2016
“Algorithms don’t understand the subtlety and the mixing of [music] genres. So we hired hundreds of the best people we know.” - Jimmy Iovine, on curation in Apple Music.
June 7, 2016
R is a fantastic tool for data analysis, and you can take it to the next level by learning the pipe %>% operator and using the packages dplyr, ggplot2, broom and a few others.
January 19, 2015
Below are notes from a talk at the Harvard Innovation Lab by Jim Waldo, CTO and instructor, Harvard University, on January 19, 2015. The following notes cover about 85% of the talk. See bottom for a bonus.
November 14, 2014
Rather than burying its data collection motives and tactics in an obscure privacy policy, the Guardian has set up a portal* with articles and videos explaining these in layman’s terms.
November 7, 2014
Boston Data Festival was this week, packed with excellent talks on topics from “Quantifying Culture” to “Evaluating Trading Algorithms Using Probabilistic Programming.”
September 24, 2014
The highs and lows of Google Flu Trends, “once a poster child for the power of big-data analysis,” serve as a case study for David Lazer, Ryan Kennedy, Gary King, et al., in their Science Magazine* article “The Parable of Google Flu: Traps in Big Data Analysis.”
August 26, 2014
From a New York Times article* “Inside Apple’s Internal Training Program”, some highlights on simplicity in Apple:
November 6, 2013
From a Reuters column*:
April 4, 2013
Over the weekend of Apple’s April 3 release of the iPad, 73% of circulated tweets were favorable toward the iPad, but 26% expressed disappointment that the iPad could not replace the iPhone, according to a study.
January 27, 2013
Steve Lohr at the New York Times* writes a reflection on the promises of Big Data, citing increasing buzz, yet also its initial big failure:
May 14, 2012
Or, when not to be too precise.