29 Apr 23
A natural question that beginners often ask is: Why doesn’t Forth have features that are standard in other languages, for example, arrays? The answer is that Forth is so facile at creating new data types that it is usually easier to invent something that exactly suits your needs than it is to force your program to conform to an arbitrary standard.
03 Apr 23
We propose encoding a 16-bit string as a proquint of alternating consonants and vowels as follows. Four-bits for consonants, and two-bits for vowels:
0 1 2 3 4 5 6 7 8 9 A B C D E F | 0 1 2 3
b d f g h j k l m n p r s t v z | a i o u
Separate proquints using dashes, which can go un-pronounced or be pronounced “eh”.
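The mapping quoted above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function names (`encode_word`, `encode_uint32`, `decode`) are made up for this note, and the bit layout assumed is consonant-vowel-consonant-vowel-consonant read from the most significant bits down (4 + 2 + 4 + 2 + 4 = 16 bits).

```python
CONSONANTS = "bdfghjklmnprstvz"  # 16 symbols, 4 bits each
VOWELS = "aiou"                  # 4 symbols, 2 bits each


def encode_word(word: int) -> str:
    """Encode one 16-bit value as a five-letter proquint."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit value")
    return (
        CONSONANTS[(word >> 12) & 0xF]   # bits 15..12
        + VOWELS[(word >> 10) & 0x3]     # bits 11..10
        + CONSONANTS[(word >> 6) & 0xF]  # bits 9..6
        + VOWELS[(word >> 4) & 0x3]      # bits 5..4
        + CONSONANTS[word & 0xF]         # bits 3..0
    )


def encode_uint32(value: int) -> str:
    """Encode a 32-bit value as two proquints joined by a dash."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit value")
    return encode_word(value >> 16) + "-" + encode_word(value & 0xFFFF)


def decode(text: str) -> int:
    """Decode dash-separated proquints back to an integer."""
    value = 0
    for ch in text:
        if ch == "-":
            continue  # dashes only separate proquints
        if ch in CONSONANTS:
            value = (value << 4) | CONSONANTS.index(ch)
        elif ch in VOWELS:
            value = (value << 2) | VOWELS.index(ch)
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    return value


# 127.0.0.1 as a 32-bit integer is 0x7F000001
print(encode_uint32(0x7F000001))  # prints "lusab-babad"
```

Decoding is the same table read in reverse, so `decode("lusab-babad")` gives back `0x7F000001`.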
27 Mar 23
HamburgChimps/apple-notes-liberator: Free your Apple Notes data from Notes.app.
13 Mar 23
A look into just who constitutes Maine’s growing population.
07 Mar 23
A guest post by Aaron Carr.
03 Mar 23
Wikidata is a free and open knowledge base that can be read and edited by humans and machines. Wikidata acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others.
An intuitive UI for managing data, for users of all technical skill levels. Built on Postgres.
20 Feb 23
Kanaries/graphic-walker: An open source alternative to Tableau. Easily embedded as a component in web apps.
14 Feb 23
Perform automated data exploration with a painter-like interface using the Data Painter feature in RATH.
28 Jan 23
Tracking these activities has served as a helpful reference for when a friend asks me what I’ve read lately or for a good recipe. Everything is in one place. I was also surprised to learn that I made 96 new recipes last year, a number I wouldn’t have known, or even been able to guess, had I not tracked.
10 Jan 23
05 Jan 23
17 Dec 22
Open source, flexible, scales to your needs. Confidently move, transform, and test your data using tools you know with a data engineering workflow you’ll love.
15 Dec 22
The machine learning community currently has no standardized process for documenting datasets, which can lead to severe consequences in high-stakes domains. To address this gap, we propose datasheets for datasets. In the electronics industry, every component, no matter how simple or complex, is accompanied with a datasheet that describes its operating characteristics, test results, recommended uses, and other information. By analogy, we propose that every dataset be accompanied with a datasheet that documents its motivation, composition, collection process, recommended uses, and so on. Datasheets for datasets will facilitate better communication between dataset creators and dataset consumers, and encourage the machine learning community to prioritize transparency and accountability.
OpenRefine is a powerful free, open source tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.
string_grouper is a library that makes finding groups of similar strings, within a single list or across multiple lists, easy and fast. string_grouper uses tf-idf to calculate cosine similarities within a single list or between two lists of strings. The full process is described in the blog post Super Fast String Matching in Python.
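To make the tf-idf and cosine-similarity idea concrete, here is a rough standard-library sketch. It is not string_grouper's API or implementation: the function names, the choice of character trigrams, the smoothed idf formula, and the 0.5 threshold are all assumptions made for illustration, and the all-pairs loop is quadratic, which is exactly the cost a real library works to avoid.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Lowercase the string and split it into overlapping character n-grams."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)] or [text]


def tfidf_vectors(strings, n=3):
    """Return one L2-normalised tf-idf vector (as a dict) per input string."""
    docs = [Counter(char_ngrams(s, n)) for s in strings]
    # Document frequency: in how many strings does each n-gram appear?
    df = Counter(gram for doc in docs for gram in doc)
    total = len(docs)
    vectors = []
    for doc in docs:
        # Smoothed idf (an assumed formula): rarer n-grams get more weight.
        vec = {g: tf * (math.log((1 + total) / (1 + df[g])) + 1)
               for g, tf in doc.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({g: w / norm for g, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two normalised sparse vectors (a dot product)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(g, 0.0) for g, w in a.items())


def similar_pairs(strings, threshold=0.5):
    """Return (string, string, score) for every pair at or above threshold."""
    vectors = tfidf_vectors(strings)
    pairs = []
    for i in range(len(strings)):
        for j in range(i + 1, len(strings)):
            score = cosine(vectors[i], vectors[j])
            if score >= threshold:
                pairs.append((strings[i], strings[j], score))
    return pairs


names = ["Acme Corp", "ACME Corp.", "Globex Inc"]
for left, right, score in similar_pairs(names):
    print(f"{left!r} ~ {right!r}: {score:.2f}")
```

With these three made-up names, only the two Acme spellings share any trigrams, so they are the only pair reported.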