Simon Willison’s Weblog

May 2018

38 posts: 2 entries, 19 links, 8 quotes, 9 beats

May 2, 2018

May 3, 2018

Iodide Notebook: Project Examples (via) Iodide is a very promising looking open source JavaScript notebook project, and these examples do a great job of showing what it can do. It’s not as slick (yet) as Observable but it does run completely independently using just a browser.

# 6:42 pm / javascript, jupyter, observable

May 5, 2018

Sighting 2:50 PM — Chimango Caracara, in Luján de Cuyo CNC2019, MZ, AR

Datasette 0.21: New _shape=, new _size=, search within columns. Nothing earth-shattering here but it’s accumulated enough small improvements that it warranted a new release. You can now send ?_shape=array to get back a plain JSON array of results, ?_size=XXX|max to get back a specific number of rows from a table view and ?_search_COLUMN=text to run full-text search against a specific column.

# 11:25 pm / projects, datasette
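
As a rough sketch of how those three querystring parameters might be assembled from Python: the host, database, table and column names below are hypothetical placeholders, and `table_url` is a helper invented for this example rather than part of Datasette.

```python
from urllib.parse import urlencode

# Hypothetical instance, database and table names, for illustration only.
BASE_URL = "https://example.com/mydb/mytable.json"


def table_url(base_url, **params):
    """Return base_url with the given querystring parameters appended."""
    return f"{base_url}?{urlencode(params)}" if params else base_url


# ?_shape=array asks for a plain JSON array of results.
array_url = table_url(BASE_URL, _shape="array")

# ?_size= takes a row count or the literal string "max".
sized_url = table_url(BASE_URL, _size=10)
max_url = table_url(BASE_URL, _size="max")

# ?_search_COLUMN= searches one column; "title" is an assumed column name.
search_url = table_url(BASE_URL, _search_title="museum")

print(array_url)
print(sized_url)
print(max_url)
print(search_url)
```

The parameters can be combined in a single call, for example `table_url(BASE_URL, _shape="array", _size=10)`.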

May 7, 2018

Somebody should write up how the early-2000s push for open standards and the Web Standards Project’s advocacy are a major factor in why Apple was able to create its enormously valuable comeback. Put another way, one of the killer moments of the first iPhone demo was Jobs saying it had the “real” web, not the “baby” web, by demonstrating the NYT homepage. That would’ve been IE-only & Windows-only if not for effective advocacy from the web standards community.

Anil Dash

# 1:28 pm / anil-dash, apple, web-standards, web-standards-project

May 8, 2018

mendoza-trees-workshop (via) Eventbrite Argentina has an academy program to train new Python/Django developers. I presented a workshop there this morning showing how Django and Jupyter can be used together to iterate on a project. Since the session was primarily about demonstrating Jupyter it was mostly live-coding, but the joy of Jupyter is that at the end of a workshop you can go back and add inline commentary to the notebooks that you used. In putting together the workshop I learned about the django_extensions “./manage.py shell_plus --notebook” command, and it’s brilliant! It launches Jupyter in a way that lets you directly import your Django models without having to mess around with DJANGO_SETTINGS_MODULE.

# 5:22 pm / django, speaking, my-talks, tutorial, eventbrite, jupyter

May 9, 2018

Notes from my appearance on the Changelog podcast

After I spoke at Zeit Day SF last weekend I sat down with Adam Stacoviak to record a 25 minute segment for episode 296 of the Changelog podcast, talking about Datasette. We covered a lot of ground!

[... 536 words]

Datasette: The Metropolitan Museum of Art (via) The Metropolitan Museum of Art publish a CSV file on GitHub with details of 464,360 items from their collection. I turned it into a searchable Datasette instance.

# 6:38 pm / art, museums, datasette

May 10, 2018

The synthetic voice of synthetic intelligence should sound synthetic. Successful spoofing of any kind destroys trust. When trust is gone, what remains becomes vicious fast.

Stewart Brand

# 4:56 am / ai, stewartbrand

The latest SQLite 3.8.7 alpha version is 50% faster than the 3.7.17 release from 16 months ago.  That is to say, it does 50% more work using the same number of CPU cycles. [...] The 50% faster number above is not about better query plans.  This is 50% faster at the low-level grunt work of moving bits on and off disk and search b-trees.  We have achieved this by incorporating hundreds of micro-optimizations.  Each micro-optimization might improve the performance by as little as 0.05%.  If we get one that improves performance by 0.25%, that is considered a huge win.  Each of these optimizations is unmeasurable on a real-world system (we have to use cachegrind to get repeatable run-times) but if you do enough of them, they add up.

D. Richard Hipp

# 5:15 am / performance, sqlite, d-richard-hipp
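
A back-of-the-envelope check of how tiny gains could add up to 50%. This assumes each micro-optimization compounds multiplicatively and that every one delivers the same gain; neither assumption comes from the quote, and `optimizations_needed` is a name made up for this sketch.

```python
import math


def optimizations_needed(total_speedup, gain_per_step):
    """Number of equal, compounding gains needed to reach total_speedup.

    Solves (1 + gain_per_step) ** n >= total_speedup for the smallest
    whole number n.
    """
    return math.ceil(math.log(total_speedup) / math.log(1 + gain_per_step))


# "50% faster" is treated here as a 1.5x speedup (an assumption).
typical = optimizations_needed(1.5, 0.0005)   # 0.05% per optimization
huge_win = optimizations_needed(1.5, 0.0025)  # 0.25% per optimization

print(typical, huge_win)
```

Under these assumptions the count at 0.05% per step lands in the high hundreds, which fits the quote's "hundreds of micro-optimizations".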

May 11, 2018

Pyre: Fast Type Checking for Python (via) Facebook’s alternative to mypy. “Pyre is designed to be highly parallel, optimizing for near-instant responses so that you get immediate feedback, even in a large codebase”. Like their Hack type checker for PHP, Pyre is implemented in OCaml.

# 5:47 pm / facebook, python, static-typing, mypy, ocaml

May 12, 2018

Datasette: Full-text search. I wrote some documentation for Datasette’s full-text search feature, which detects tables which have been configured to use the SQLite FTS module and adds a search input box and support for a _search= querystring parameter.

# 12:09 pm / full-text-search, search, sqlite, datasette
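
For context, here is a minimal sketch of the kind of SQLite FTS table that the feature looks for, built with Python's standard `sqlite3` module. It assumes the SQLite library bundled with your Python includes the FTS5 extension. The `docs` table, its rows and the `search` helper are invented for illustration; this is not Datasette's own code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A virtual table backed by the FTS5 module (assumed to be available).
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
conn.executemany(
    "INSERT INTO docs (title, body) VALUES (?, ?)",
    [
        ("Facets", "Faceted browse for Datasette tables"),
        ("Search", "Full-text search using SQLite"),
        ("Releases", "Small improvements and bug fixes"),
    ],
)


def search(connection, term):
    """Return titles of rows matching term, best match first."""
    rows = connection.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank", (term,)
    )
    return [row[0] for row in rows]


print(search(conn, "search"))
print(search(conn, "datasette"))
```

A `_search=` querystring value would play roughly the role of the `term` argument here.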

May 13, 2018

Sighting 1:19 PM — Grayish Baywing, in Distrito Federal, DF, AR

May 16, 2018

isomorphic-git (via) A pure-JavaScript implementation of the git protocol and underlying tools which works both server-side (Node.js) AND in the client, using an emulation of the fs API. Given the right CORS headers it can clone a GitHub repository over HTTPS right into your browser. Impressive.

# 8:54 pm / git, javascript, cors

How to number rows in MySQL. MySQL’s user variables can be used to add a “rank” or “row_number” column to a database query that shows the ranking of a row against a specific unique value. This means you can return the first N rows for any given column—for example, given a list of articles return just the first three tags for each article. I’ve recently found myself using this trick for a few different things—once you know it, chances to use it crop up surprisingly often.

# 9:06 pm / mysql
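
The entry above does not include the SQL itself, so this is not the MySQL query. It is a pure-Python sketch of the bookkeeping that two user variables would perform while MySQL walks an ordered result set: one remembers the previous grouping value, the other counts rows within the group. The function name and sample data are hypothetical.

```python
def first_n_per_group(rows, n):
    """Return (key, value, rank) for the first n rows of each key.

    rows is an iterable of (key, value) pairs. They are sorted first,
    mirroring the ORDER BY that the SQL version would rely on.
    """
    ranked = []
    prev_key = None  # plays the role of a "previous value" user variable
    rank = 0         # plays the role of a "row number" user variable
    for key, value in sorted(rows):
        # Restart numbering whenever the grouping key changes.
        rank = rank + 1 if key == prev_key else 1
        prev_key = key
        if rank <= n:
            ranked.append((key, value, rank))
    return ranked


# Hypothetical (article, tag) pairs.
article_tags = [
    ("a1", "python"),
    ("a1", "django"),
    ("a1", "sqlite"),
    ("a1", "mysql"),
    ("a2", "git"),
]

for row in first_n_per_group(article_tags, 3):
    print(row)
```

The `rank <= n` check corresponds to filtering on the computed row number to keep only the first N rows per group.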

May 17, 2018

Django #8936: Add view (read-only) permission to admin (closed). Opened 10 years ago. Closed 15 hours ago. I apparently filed this issue during the first DjangoCon back in September 2008, when Adrian and Jacob mentioned on-stage that they would like to see a read-only permission for the Django Admin. Thanks to Olivier Dalang from Fiji and Petr Dlouhý from Prague it’s going to be a feature shipping in Django 2.1. Open source is a beautiful thing.

# 1:40 pm / django, django-admin, djangocon, open-source

sql.js Online SQL interpreter (via) This is fascinating: sql.js is a project that compiles the whole of SQLite to JavaScript using Emscripten. The demo is an online SQL interpreter which lets you import an existing SQLite database from your filesystem and run queries against it directly in your browser.

# 9:28 pm / javascript, sqlite

sqlitebiter. Similar to my csvs-to-sqlite tool, but sqlitebiter handles “CSV/Excel/HTML/JSON/LTSV/Markdown/SQLite/SSV/TSV/Google-Sheets”. Most interestingly, it works against HTML pages: run “sqlitebiter -v url 'https://en.wikipedia.org/wiki/Comparison_of_firewalls'” and it will scrape that Wikipedia page and create a SQLite table for each of the HTML tables it finds there.

# 10:40 pm / csv, scraping, sqlite, datasette

May 19, 2018

Sighting 5:51 PM — Giant Bell Jelly, in Monterey Bay Area, CA, US
Sighting 12:08 PM – 12:13 PM — California Sea Lion, in Monterey Bay Area, CA, US
Sighting 12:12 PM — Southern Sea Otter, in Monterey Bay Area, CA, US

May 20, 2018

Datasette Facets

Datasette 0.22 is out with the most significant new feature I’ve added since the initial release: faceted browse.

[... 1,189 words]

Release datasette Datasette 0.22: Datasette Facets — An open source multi-tool for exploring and publishing data

May 21, 2018

The big thing I always get asked to find are dank dilapidated alleys, and New York City has, like, 5 alleys that look like that. Maybe four. You can’t film in three of them. So what it comes down to is there’s one alley left in New York, Cortlandt Alley, that everybody films in because it’s the last place. I try to stress to these directors in a polite way that New York is not a city of alleys. Boston is a city of alleys. Philadelphia has alleys. I don’t know anyone who uses the ‘old alleyway shortcut’ to go home. It doesn’t exist here. But that’s the movie you see.

Nick Carr

# 12:04 am / film, new-york

VirtualKNN for SpatiaLite. This looks amazing: a special virtual table shipped as part of SpatiaLite 4.4.0 which implements a fast, R-Tree backed mechanism for finding the X nearest points against a geospatial database table. There’s just one catch: it’s only available in 4.4.0, but the most recent “stable” release of SpatiaLite is 4.3.0a from September 2015 so the version you get if you install from apt-get or homebrew doesn’t yet have this functionality. I’d love to figure out a neat way to package and distribute this along with Datasette. I’d also like to figure out a clean way to ship a more recent version of SQLite than the one that is currently packaged with Python 3 (3.16.2, where the latest SQLite release is 3.23.1).

# 9:23 pm / geospatial, spatialite, sqlite
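
To see which SQLite release your own Python is linked against, the standard `sqlite3` module exposes it directly. A small sketch follows; the `supports` helper is a name made up for this example.

```python
import sqlite3

# Version of the SQLite library this Python is linked against,
# as a tuple of integers such as (3, 16, 2).
linked_version = sqlite3.sqlite_version_info


def supports(minimum):
    """Return True if the linked SQLite is at least the given version tuple."""
    return linked_version >= minimum


print("SQLite library version:", sqlite3.sqlite_version)
print("At least 3.23.1:", supports((3, 23, 1)))
```

The result depends on how Python was built and which SQLite library it loads at run time, so it can differ between machines running the same Python version.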

New in Django 2.0: Database instrumentation. I missed this previously. Django 2.0 shipped with one of my most-wanted features: the ability to easily instrument database calls (for logging and metrics) without having to monkey-patch or run an entirely new database backend. Can’t wait to try this out.

# 9:28 pm / django

May 22, 2018

Observable: Downloading and Embedding Notebooks (via) Big news from the Observable team: firstly, they’ve released the open source runtime for their notebooks which means you can now execute the code from a notebook independently of their hosted service. On top of that they’ve constructed an elegant way of exporting and executing notebooks (or specific notebook cells) as ES6 modules and as installable npm package tarballs.

# 12:14 pm / javascript, observable

Google is not trying to break the web by pushing for more HTTPS. Neither is Mozilla and neither are any of the other orgs saying "Hey, it would be good if traffic wasn't eavesdropped on or modified". This is fixing a deficiency in the web as it has stood for years.

Troy Hunt

# 4:17 pm / browsers, https, security, troy-hunt

Hynek Schlawack: Testing & Packaging (via) “How to ensure that your tests run code that you think they are running, and how to measure your coverage over multiple tox runs (in parallel!)”—Hynek makes a convincing argument for putting your packaged Python code in a src/ directory for ease of testing and coverage.

# 10:12 pm / packaging, python, testing, hynek-schlawack
