
SQL tokenizer

Convert SQL statements into a list of tokens

Installation | Usage | License

Installation

With npm do

npm install sql-tokenizer

Usage

Create a tokenize function.

import { sqlTokenizer } from 'sql-tokenizer'

const tokenize = sqlTokenizer()

Turn an SQL statement into tokens.

tokenize('select * from revenue')
// ['select', ' ', '*', ' ', 'from', ' ', 'revenue']

Quotes are handled properly.

tokenize(`select 'O''Reilly' as "book shelf"`)
// ['select', ' ', "'O''Reilly'", ' ', 'as', ' ', '"book shelf"']

Indentation is preserved.

tokenize(`
SELECT COUNT(*) AS num
FROM (
	SELECT *
	FROM mytable
	WHERE yyyymmdd=20170101
		AND country IN ('IT','US')
)
`)
// [
//   '\n',
//   'SELECT', ' ', 'COUNT', '(', '*', ')', ' ', 'AS', ' ', 'num', '\n',
//   'FROM', ' ', '(', '\n',
//   '\t', 'SELECT', ' ', '*', '\n',
//   '\t', 'FROM', ' ', 'mytable', '\n',
//   '\t', 'WHERE', ' ', 'yyyymmdd', '=', '20170101', '\n',
//   '\t\t', 'AND', ' ', 'country', ' ', 'IN', ' ', '(', "'IT'", ',', "'US'", ')', '\n',
//   ')', '\n'
// ]

Special characters

The sqlTokenizer function accepts an optional array of special chars, which defaults to the sqlSpecialChars list exported by sql-tokenizer and defined in specialChars.js.

A special char here means a sequence of characters containing no letters. So, for example, the +, -, *, / operators are included in the list, while AND and OR are not, because they are made of letters.
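
Since * is among the default special chars, it is split into its own token even without surrounding whitespace, just like the = in the WHERE clause above. A minimal sketch (the query is illustrative and the output shows the expected result under the default special chars):

tokenize('select price*quantity from sales')
// ['select', ' ', 'price', '*', 'quantity', ' ', 'from', ' ', 'sales']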

If you need to add some custom special chars to the list, you can do something like the following:

import { sqlSpecialChars, sqlTokenizer } from 'sql-tokenizer'

const mySpecialChars = ['++', '??']

const tokenize = sqlTokenizer(sqlSpecialChars.concat(mySpecialChars))
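
The resulting tokenize function should then treat each custom sequence as a single token. For instance (the statement is illustrative and the output is the expected result, not verified against the library):

tokenize('select a ?? b from t')
// ['select', ' ', 'a', ' ', '??', ' ', 'b', ' ', 'from', ' ', 't']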

License

MIT
