Releases: k2-fsa/sherpa-onnx
v1.10.30
What's Changed
- Fix building node-addon for Windows x86. by @csukuangfj in #1469
- Begin to support https://github.com/usefulsensors/moonshine by @csukuangfj in #1470
- Publish pre-built JNI libs for Linux aarch64 by @csukuangfj in #1472
- Add C++ runtime and Python APIs for Moonshine models by @csukuangfj in #1473
- Add Kotlin and Java API for Moonshine models by @csukuangfj in #1474
- Add C and C++ API for Moonshine models by @csukuangfj in #1476
- Add Swift API for Moonshine models. by @csukuangfj in #1477
- Add Go API examples for adding punctuation to text. by @csukuangfj in #1478
- Add Go API for Moonshine models by @csukuangfj in #1479
- Add JavaScript API for Moonshine models by @csukuangfj in #1480
- Add Dart API for Moonshine models. by @csukuangfj in #1481
- Add Pascal API for Moonshine models by @csukuangfj in #1482
- Add C# API for Moonshine models. by @csukuangfj in #1483
- Release v1.10.30 by @csukuangfj in #1484
Full Changelog: v1.10.29...v1.10.30
v1.10.29
What's Changed
- Upload speaker embedding models to huggingface by @csukuangfj in #1428
- "Speaker identification" is repeat! by @semxum in #1431
- Add Go API for offline punctuation models by @csukuangfj in #1434
- updated onnxruntime-linux-aarch64.cmake so that libonnxruntime.so can… by @shawl336 in #1436
- Support https://huggingface.co/Revai/reverb-diarization-v1 by @csukuangfj in #1437
- Fix "log10" compile error by including the cmath header by @Zazzle516 in #1438
- add more models for speaker diarization by @csukuangfj in #1440
- Add Java API example for hotwords. by @csukuangfj in #1442
- update java for hotword jar by @YeyuchenBa in #1444
- add java android demo by @JameWade in #1454
- Add C++ API for streaming ASR. by @csukuangfj in #1455
- Add C++ API for non-streaming ASR by @csukuangfj in #1456
- Fix style issues by @csukuangfj in #1458
- Handle NaN embeddings in speaker diarization. by @csukuangfj in #1461
- Add speaker identification with VAD and non-streaming ASR using ALSA by @Peakyxh in #1463
- Support GigaAM CTC models for Russian ASR by @csukuangfj in #1464
- Add GigaAM NeMo transducer model for Russian ASR by @csukuangfj in #1467
- Release v1.10.29 by @csukuangfj in #1468
New Contributors
- @semxum made their first contribution in #1431
- @Zazzle516 made their first contribution in #1438
- @YeyuchenBa made their first contribution in #1444
- @JameWade made their first contribution in #1454
- @Peakyxh made their first contribution in #1463
Full Changelog: v1.10.28...v1.10.29
v1.10.28
What's Changed
- Fix swift example for generating subtitles. by @csukuangfj in #1362
- Allow more online models to load the tokens file from memory by @shawl336 in #1352
- Fix CI errors introduced by supporting loading keywords from buffers by @csukuangfj in #1366
- Update online_model.dart by @flutter-painter in #1375
- Fix running MeloTTS models on GPU. by @csukuangfj in #1379
- Support Parakeet models from NeMo by @csukuangfj in #1381
- Export Pyannote speaker segmentation models to onnx by @csukuangfj in #1382
- Support Agglomerative clustering. by @csukuangfj in #1384
- Add Python API for clustering by @csukuangfj in #1385
- support whisper turbo by @csukuangfj in #1390
- Potentially fixes segmentation fault in online decoding with hotwords by @vsd-vector in #1393
- Speaker diarization example with onnxruntime Python API by @csukuangfj in #1395
- C++ API for speaker diarization by @csukuangfj in #1396
- Python API for speaker diarization. by @csukuangfj in #1400
- C API for speaker diarization by @csukuangfj in #1402
- docs(nodejs-addon-examples): add guide for pnpm user by @YogiLiu in #1401
- Go API for speaker diarization by @csukuangfj in #1403
- Swift API for speaker diarization by @csukuangfj in #1404
- Update readme to include more external projects using sherpa-onnx by @csukuangfj in #1405
- C# API for speaker diarization by @csukuangfj in #1407
- JavaScript API (node-addon) for speaker diarization by @csukuangfj in #1408
- WebAssembly example for speaker diarization by @csukuangfj in #1411
- Handle audio files less than 10s for speaker diarization. by @csukuangfj in #1412
- JavaScript API with WebAssembly for speaker diarization by @csukuangfj in #1414
- Kotlin API for speaker diarization by @csukuangfj in #1415
- Java API for speaker diarization by @csukuangfj in #1416
- Dart API for speaker diarization by @csukuangfj in #1418
- Pascal API for speaker diarization by @csukuangfj in #1420
- Android JNI support for speaker diarization by @csukuangfj in #1421
- Android demo for speaker diarization by @csukuangfj in #1423
- Release v1.10.28 by @csukuangfj in #1424
New Contributors
- @flutter-painter made their first contribution in #1375
- @YogiLiu made their first contribution in #1401
Full Changelog: v1.10.27...v1.10.28
speaker-segmentation-models
v1.10.27
What's Changed
- Fix sherpa_onnx.go by @lllwan in #1353
- Support passing utf-8 strings from JavaScript to C++. by @csukuangfj in #1355
- Fix building flutter examples by @csukuangfj in #1356
- Add non-streaming ONNX models for Russian ASR by @csukuangfj in #1358
- Release v1.10.27 by @csukuangfj in #1359
Full Changelog: v1.10.26...v1.10.27
v1.10.26
What's Changed
- Add links to projects using sherpa-onnx. by @csukuangfj in #1345
- Support lang/emotion/event results from SenseVoice in Swift API. by @csukuangfj in #1346
- Support specifying max speech duration for VAD. by @csukuangfj in #1348
- Add APIs about max speech duration in VAD for various programming languages by @csukuangfj in #1349
- Release v1.10.26 by @csukuangfj in #1350
Full Changelog: v1.10.25...v1.10.26
v1.10.25
What's Changed
- Fix releasing dart packages. by @csukuangfj in #1317
- Throw an error instead of exiting on failure to read a wav file in Java by @RGdevz in #1323
- Re-implement LM rescore for online transducer by @SilverSulfide in #1231
- Fixed the C api calls and created the TTS project file by @twodawg in #1324
- Build websocket related binaries for embedded systems. by @csukuangfj in #1327
- fix wasm app for streaming paraformer by @csukuangfj in #1328
- Fix vad.Flush() by @csukuangfj in #1329
- Fix typos by @csukuangfj in #1330
- Add Python binding for online punctuation models by @yaochie in #1312
- Fix building by @csukuangfj in #1331
- Preserve previous result as context for next segment by @vsd-vector in #1335
- Fix computing features for CED audio tagging models. by @csukuangfj in #1341
- Re-submitted pull request: allow tokens and hotwords to be loaded directly from a buffered string by @shawl336 in #1339
- Fix building by @csukuangfj in #1343
- Release v1.10.25 by @csukuangfj in #1344
New Contributors
- @RGdevz made their first contribution in #1323
- @twodawg made their first contribution in #1324
- @yaochie made their first contribution in #1312
- @shawl336 made their first contribution in #1339
Full Changelog: v1.10.24...v1.10.25
v1.10.24
Release v1.10.24 (#1309)
v1.10.23
What's Changed
- flutter: add lang, emotion, event to OfflineRecognizerResult by @eschmidbauer in #1268
- Use a separate thread to initialize models for lazarus examples. by @csukuangfj in #1270
- Object pascal examples for recording and playing audio with portaudio. by @csukuangfj in #1271
- Text to speech API for Object Pascal. by @csukuangfj in #1273
- update kotlin api for better release native object and add user-frien… by @fbzhong in #1275
- Provide models for mobile-only platforms by fixing batch size to 1 by @csukuangfj in #1276
- Update wave-reader.cc by @diyism in #1278
- Set batch size to 1 for more streaming ASR models by @csukuangfj in #1280
- Add WebAssembly for VAD by @csukuangfj in #1281
- WebAssembly example for VAD + Non-streaming ASR by @csukuangfj in #1284
- Add VAD and keyword spotting for the Node package with WebAssembly by @csukuangfj in #1286
New Contributors
- @eschmidbauer made their first contribution in #1268
- @diyism made their first contribution in #1278
Full Changelog: v1.10.22...v1.10.23
v1.10.22
What's Changed
- Exclude .DS_Store files from flutter tts assets by @csukuangfj in #1238
- Add Pascal API for reading wave files by @csukuangfj in #1243
- Pascal API for streaming ASR by @csukuangfj in #1246
- Pascal API for non-streaming ASR by @csukuangfj in #1247
- Pascal API for VAD by @csukuangfj in #1249
- Update offline-recognizer.cc by @iprovalo in #1253
- Add more C API examples by @zhu-han in #1255
- Add emotion, event of SenseVoice. by @fbzhong in #1257
- Support reading multi-channel wave files with 8/16/32-bit encoded samples by @csukuangfj in #1258
- Enable IPO only for Release build. by @csukuangfj in #1261
- Add Lazarus example for generating subtitles using Silero VAD with non-streaming ASR by @csukuangfj in #1251
- chore: update online-stream.h by @eltociear in #1264
- Build subtitle-generation apps for more models by @csukuangfj in #1265
- Fix looking up OOVs in lexicon.txt for MeloTTS models. by @csukuangfj in #1266
- Release v1.10.22 by @csukuangfj in #1267
Full Changelog: v1.10.21...v1.10.22