The Language Navigator is both a dataset and a tool designed to help people explore and understand the relationships between different languages. It provides a user-friendly interface for visualizing language families, dialects, and other linguistic features and how they are situated across the globe.

Translation-Commons/lang-nav

Language Navigator

This repository contains a dataset about the world's languages and language-like objects, as well as a website framework to visualize the information.

Preview of the website

There are multiple ways to visualize the data:

Card List, Details, Hierarchy, Table, Reports

Project Overview

Motivation

This website was put together to give an overview of the world's languages and language-related concepts. It is similar to Ethnologue & Glottolog, but differs in that it aims to:

  • Be free & open to all consumers.
  • Show all data, even contested language definitions, while making sure to put contested data in context.
  • Provide actionable insights -- make sure the data is clear enough that consumers can come to this page to get answers.

Common questions people will come to this resource for are:

  • What's the language code for XXXX?
  • What languages are used in country YYYY?
  • What are the top languages in the world given some criteria?

Tech Stack

  • Frontend: The website is rendered in JavaScript, specifically React with TypeScript.
  • Backend: The website is built and served with Node and Vite.
  • Data: Data files are written in tab-separated-value (TSV) format.
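
Since the data lives in TSV files, a reader may find it useful to see how such a file could be turned into records in TypeScript. The following is only a minimal sketch, not the repository's actual loader: the function name parseTsv, the TsvRow type, and the column names in the sample are hypothetical, and the population values are placeholders rather than real figures.

```typescript
// Hypothetical row shape: each record maps a header name to a cell value.
type TsvRow = Record<string, string>;

// Parse tab-separated text whose first non-empty line is a header row.
// Missing trailing cells are filled with empty strings.
function parseTsv(text: string): TsvRow[] {
  const lines = text.split(/\r?\n/).filter((line) => line !== "");
  if (lines.length === 0) return [];
  const headers = (lines[0] ?? "").split("\t");
  return lines.slice(1).map((line) => {
    const cells = line.split("\t");
    const row: TsvRow = {};
    headers.forEach((header, i) => {
      row[header] = cells[i] ?? "";
    });
    return row;
  });
}

// Illustrative input only: column names and population values are assumptions.
const sampleTsv = "code\tname\tpopulation\neng\tEnglish\t300\nfra\tFrench\t100\n";
const sampleRows = parseTsv(sampleTsv);
console.log(sampleRows.map((r) => `${r["code"]}: ${r["name"]}`).join(", "));
```

Keeping every cell as a string at this stage leaves numeric conversion (for example of a population column) to the code that knows which columns are numeric.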

Partners

We've partnered with various organizations, both to get data from them and to provide data to them: UNESCO's World Atlas of Languages, Digital Language Vitality, and Unicode CLDR.

Data

The data comes from multiple sources, primarily CLDR, Ethnologue, and Glottolog.

Development Instructions

To run the website on a local server, follow these instructions.

  1. Install Node Package Manager (npm); see the official Node documentation for installation instructions
  2. Download the repository to your computer -- go to that folder when you are done
  3. Run npm install to install the relevant Node and Vite packages
  4. Run npm run dev to start the server with some dev options
    1. or npm run build for the public version
  5. Depending on what port is used, the website can now be accessed using a local browser at a URL like http://localhost:5173/

To push the changes to the deployed website (the GitHub Pages site), follow these instructions.

  1. Run npm run deploy to deploy the changes. This will:
    1. Build the app into the dist/ folder.
    2. Push the dist/ contents to the gh-pages branch.

Initialization

This is how we created the project originally -- you should not need to run these steps, but they are included for background.

  1. Initialize the project using Vite: npm create vite@latest
  2. Choose lang-nav as the project name, then React + TypeScript
  3. Change into the lang-nav directory and run npm install
  4. Set up the linter
  5. Initialize ESLint: npx eslint --init
  6. Choose options: what: javascript, use: problems, modules: esm, framework: react, typescript: yes, runs on: browser
  7. npm install --save-dev prettier eslint-config-prettier eslint-plugin-prettier
  8. More magic to get it to run... I had to install ESLint on my IDE (VSCode)
  9. Some plugins, like eslint-plugin-import, were added after this library was started
  10. Import other libraries
    1. npm install react-router-dom
  11. Start the dev server: npm run dev

How to contribute

Adding or Updating Data

There's a lot of data shown here, but there could always be more. The main way to add or update data is to edit the tab-separated-value (TSV) files directly. They are all in the public/data directory.

If you want to add entries or update values, you can just edit the existing TSVs.

However, if you want to add a lot more data or add contested data, it may be better to make new TSVs and then update the website to use those instead.
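
When editing TSVs by hand, a common slip is a row with a missing or extra tab. As a rough sketch of a pre-commit sanity check (the function name findMalformedLines is hypothetical and this is not part of the repository's tooling, as far as this README states), one could compare each row's column count against the header:

```typescript
// Return the 1-based line numbers whose column count differs from the header's.
// Completely empty lines (such as a trailing newline) are ignored.
function findMalformedLines(text: string): number[] {
  const lines = text.split(/\r?\n/);
  const expected = (lines[0] ?? "").split("\t").length;
  const bad: number[] = [];
  lines.forEach((line, index) => {
    if (index === 0 || line === "") return;
    if (line.split("\t").length !== expected) bad.push(index + 1);
  });
  return bad;
}

// Illustrative input: the third line is missing a column.
const editedTsv = "code\tname\naaa\tExample A\nbbb\n";
console.log(findMalformedLines(editedTsv)); // [3]
```

A check like this only catches structural problems; it does not validate that codes or values are correct.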

Adding a Feature

TODO add a guide

Functionality

Here's a list of planned functionality. Completed functions are checked off.

  • Language-adjacent objects
    • Languages
      • Core attributes
      • ISO parent/child connections
      • Language families
      • Glottolog
      • Digital Support details
      • Vitality details
      • Keyboard availability details
    • Territories
      • Countries & Dependencies
      • Continents & other regions
    • Locales (languages + territories + potentially other specificity)
      • Basic data
      • Computed regional locales
      • Population estimate sources
    • Writing Systems
      • Basic data
      • Relationship w/ other writing systems (containment, lineage)
    • Language Variants / IANA tags
    • Censuses
      • Regular censuses
      • Include citation information
      • Continue importing new censuses
      • Convert other imported datasets into census-like objects
  • Views
    • Cards
    • Details
    • Hierarchy
    • Table
    • Map
    • Reports
      • Language name overlap
      • Invalid languages
      • Locales that should be added
      • Metrics on the data we have
  • Interactivity
    • Search
      • By Code
      • By Name, Endonym
      • Highlight search
      • For Hierarchy
      • Using typeahead
      • When few results are shown, suggest alternatives
    • Filter
      • By Scope
      • By Country
      • Integrate in search bar
    • Hovercard & Tooltips
      • Related objects
      • Field explanations
    • Sort By: Population, Name, Code
    • Limit
      • Pagination
    • Visual options
      • Change locale separator (_ or -)
    • Selection
    • Export
  • Manage data sources
    • Show results based on different definitions of what a language is
      • ISO, Glottolog, CLDR, All
      • Highlight language codes in each
    • Add a better guide for different kinds of users
  • About Page
    • Introduction, how to use
    • Acknowledgements
  • Future ideas
    • Database-powered backend
    • Feedback mechanisms
    • Metrics
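
To make a few of the interactivity items above more concrete (search by code or name, sort by population, limit, and the locale separator option), here is a small hedged sketch. The LanguageEntry shape and the function names searchLanguages and formatLocale are assumptions for illustration and do not reflect the website's actual components; the entries and population values are placeholders.

```typescript
// Hypothetical minimal shape of a language record.
interface LanguageEntry {
  code: string;
  name: string;
  population: number;
}

// Match entries whose code starts with the query or whose name contains it
// (case-insensitive), sort by population descending, and keep at most `limit`.
function searchLanguages(
  entries: LanguageEntry[],
  query: string,
  limit: number,
): LanguageEntry[] {
  const q = query.trim().toLowerCase();
  return entries
    .filter(
      (e) =>
        e.code.toLowerCase().startsWith(q) || e.name.toLowerCase().includes(q),
    )
    .sort((a, b) => b.population - a.population) // filter() returned a copy, so the input is not mutated
    .slice(0, limit);
}

// Rewrite a locale identifier to use the chosen separator (_ or -).
function formatLocale(locale: string, separator: "_" | "-"): string {
  return locale.replace(/[_-]/g, separator);
}

// Placeholder entries; the population values are not real figures.
const demoEntries: LanguageEntry[] = [
  { code: "aaa", name: "Example A", population: 10 },
  { code: "aab", name: "Example B", population: 30 },
  { code: "zzz", name: "Other", population: 20 },
];
console.log(searchLanguages(demoEntries, "aa", 5).map((e) => e.code));
console.log(formatLocale("en_US", "-")); // en-US
```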

License
