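Project Overview
SenseVoice is an audio foundation model with audio understanding capabilities, including automatic speech recognition (ASR), spoken language identification (LID), speech emotion recognition (SER), and acoustic event classification (AEC) or acoustic event detection (AED).
SenseVoice-small currently supports multilingual speech recognition for Chinese, Cantonese, English, Japanese, and Korean, along with emotion recognition and event detection, at very low inference latency. This project provides the ONNX environment setup and the inference workflow needed to run the SenseVoice model from Python.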
Usage
Installation
pip install sensevoice-onnx
Or install directly from the GitHub repository:
pip install git+https://github.com/lovemefan/SenseVoice-python.git
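To confirm that the installation succeeded, one option is to query the package metadata from Python. The sketch below uses only the standard library. The helper name `installed_version` is hypothetical, and the sketch assumes the distribution is registered under the name used in the pip command (`sensevoice-onnx`); an install from the Git repository may or may not report the same name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "sensevoice-onnx" is the distribution name taken from the pip command.
    found = installed_version("sensevoice-onnx")
    print(found if found else "sensevoice-onnx is not installed")
```

If the helper returns `None`, the package is not visible to the current interpreter, which usually means pip installed it into a different environment.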
Running
sensevoice --audio sensevoice/resource/asr_example_zh.wav
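On first use, the model is downloaded automatically from Hugging Face. If the download fails, point the HF_ENDPOINT environment variable at a Hugging Face mirror such as https://hf-mirror.com (the first command below is for Linux/macOS shells, the second for Windows PowerShell):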
export HF_ENDPOINT=https:
$env:HF_ENDPOINT = "https://hf-mirror.com"
Alternatively, set the environment variable non-intrusively, for a single command only:
HF_ENDPOINT=https://hf-mirror.com sensevoice --audio sensevoice/resource/asr_example_zh.wav
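The same per-command override can be reproduced when launching the CLI from a Python script. This is a minimal sketch, not part of the project: the helper names `build_env`, `build_command`, and `run_sensevoice` are hypothetical, and the only CLI flag used is the `--audio` flag shown above. The environment is copied before it is modified, so the parent process keeps its own settings.

```python
import os
import subprocess
from typing import Dict, List, Optional


def build_env(endpoint: str, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with HF_ENDPOINT set to the given mirror."""
    env = dict(os.environ if base is None else base)
    env["HF_ENDPOINT"] = endpoint
    return env


def build_command(audio_path: str) -> List[str]:
    """Build the argument list for the sensevoice CLI (only --audio is used)."""
    return ["sensevoice", "--audio", audio_path]


def run_sensevoice(audio_path: str, endpoint: str = "https://hf-mirror.com") -> subprocess.CompletedProcess:
    """Run the CLI once with the mirror applied to the child process only."""
    return subprocess.run(build_command(audio_path), env=build_env(endpoint), check=True)
```

Calling `run_sensevoice("sensevoice/resource/asr_example_zh.wav")` would then behave like the one-line shell form, assuming the `sensevoice` executable is on the PATH.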
SenseVoice script arguments
optional arguments:
  -h,
  -a ,
  -dp ,
  -d ,
        Device
  -n ,
        Num threads
  -l ,
Result
2024-07-19 07:22:40,643 INFO [sense_voice_ort_session.py:130] Loading model from /home/runner/work/SenseVoice-python/SenseVoice-python/sensevoice/resource/embedding.npy
2024-07-19 07:22:40,647 INFO [sense_voice_ort_session.py:133] Loading model /home/runner/work/SenseVoice-python/SenseVoice-python/sensevoice/resource/sense-voice-encoder.onnx
2024-07-19 07:22:42,755 INFO [sense_voice_ort_session.py:140] Loading /home/runner/work/SenseVoice-python/SenseVoice-python/sensevoice/resource/sense-voice-encoder.onnx takes 2.11 seconds
2024-07-19 07:22:42,786 INFO [sense_voice.py:76] Audio sensevoice/resource/asr_example_zh.wav is 5.58 seconds
2024-07-19 07:22:43,102 INFO [sense_voice.py:81] [0.61s - 5.53s] <|NEUTRAL|><|Speech|><|woitn|>欢迎大家来体验达摩院推出的语音识别模型
2024-07-19 07:22:43,102 INFO [sense_voice.py:83] Decoder audio takes 0.31638407707214355 seconds
2024-07-19 07:22:43,103 INFO [sense_voice.py:84] The RTF is 0.05669965538927304.
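For downstream processing, the recognition line in this log can be split into its time span, its `<|...|>` tags, and the transcript. The sketch below is an illustration based only on the log format shown here, not an API of the project; `Segment`, `parse_segment`, and `real_time_factor` are hypothetical names. The RTF helper assumes the usual definition (decoding time divided by audio duration), which matches the figures in the log: 0.316 s of decoding for 5.58 s of audio.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches "[0.61s - 5.53s] <|TAG|><|TAG|>text" anywhere in a log line.
SEGMENT_RE = re.compile(
    r"\[(?P<start>\d+(?:\.\d+)?)s\s*-\s*(?P<end>\d+(?:\.\d+)?)\s*s\]\s*"
    r"(?P<tags>(?:<\|[^|]+\|>)*)(?P<text>.*)"
)
TAG_RE = re.compile(r"<\|([^|]+)\|>")


@dataclass
class Segment:
    start: float      # segment start, in seconds
    end: float        # segment end, in seconds
    tags: List[str]   # tag names without the <| |> delimiters
    text: str         # transcript that follows the tags


def parse_segment(line: str) -> Segment:
    """Extract the time span, tags, and transcript from one result line."""
    match = SEGMENT_RE.search(line)
    if match is None:
        raise ValueError("no recognition segment found in line")
    return Segment(
        start=float(match["start"]),
        end=float(match["end"]),
        tags=TAG_RE.findall(match["tags"]),
        text=match["text"].strip(),
    )


def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """Decoding time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return decode_seconds / audio_seconds
```

Applied to the sample line, `parse_segment` yields the span 0.61 to 5.53 seconds, the tags `NEUTRAL`, `Speech`, and `woitn`, and the Chinese transcript as plain text.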
https://github.com/lovemefan/SenseVoice-python