sul-dlss/speech-to-text issues and pull requests

#73 - Add Batch

Pull Request - State: open - Opened by edsu 2 months ago

#72 - update to supported ruff check

Pull Request - State: closed - Opened by peetucket 2 months ago

#71 - send an exception name to honeybadger

Pull Request - State: closed - Opened by peetucket 2 months ago

#70 - Align CI Python version

Pull Request - State: closed - Opened by edsu 2 months ago

#69 - Upload log message

Pull Request - State: closed - Opened by edsu 2 months ago

#68 - Use a nvidia/cuda base image

Pull Request - State: closed - Opened by edsu 2 months ago - 3 comments

#67 - Remove ffprobe3

Pull Request - State: closed - Opened by edsu 2 months ago

#66 - cast a wider net for .env files to ignore (e.g. .env.qa)

Pull Request - State: closed - Opened by jmartin-sul 2 months ago

#65 - add speech-to-text to weekly dependency updates

Issue - State: open - Opened by jmartin-sul 2 months ago
Labels: prod-blocker

#64 - Investigate WhisperX performance

Issue - State: open - Opened by alundgard 2 months ago - 3 comments

#63 - Fix exception handling

Pull Request - State: closed - Opened by edsu 3 months ago

#62 - Docker container exiting due to segfault

Issue - State: closed - Opened by edsu 3 months ago - 3 comments

#61 - spike: can we notify Honeybadger when the speech-to-text container exits with a non-zero error code?

Issue - State: open - Opened by jmartin-sul 3 months ago - 1 comment
Labels: implementation question

#60 - UnboundLocalError: cannot access local variable 'runs' where it is not associated with a value

Issue - State: closed - Opened by edsu 3 months ago
Labels: bug

#59 - RuntimeError: Failed to load audio

Issue - State: closed - Opened by edsu 3 months ago - 5 comments
Labels: bug

#58 - CI: replace current action for ruff with officially supported version

Issue - State: closed - Opened by jmartin-sul 3 months ago
Labels: tech debt

#57 - Added mypy and ruff details

Pull Request - State: closed - Opened by edsu 3 months ago

#56 - Set word_timestamps on transcription

Pull Request - State: closed - Opened by edsu 3 months ago - 1 comment

#55 - add honeybadger alerts on exception

Pull Request - State: closed - Opened by peetucket 3 months ago - 2 comments

#54 - Allow media specific options

Pull Request - State: closed - Opened by edsu 3 months ago

#53 - Amara Enterprise editing tool API testing

Issue - State: open - Opened by laurensorensen 3 months ago - 4 comments

#52 - Stanza editing tool API testing

Issue - State: open - Opened by laurensorensen 3 months ago - 11 comments

#51 - File specific language specification

Issue - State: closed - Opened by edsu 3 months ago

#50 - Check media and log attributes

Pull Request - State: closed - Opened by edsu 3 months ago - 1 comment

#48 - speech_to_text.py should check to see whether a file contains audio before processing it with whisper

Issue - State: closed - Opened by jmartin-sul 3 months ago
Labels: question

#47 - Whisper output quality: test transcription of video where most of speech is German, but opening 30ish seconds of speech is English

Issue - State: closed - Opened by jmartin-sul 3 months ago - 2 comments

#46 - Build and deploy speech-to-text

Issue - State: open - Opened by edsu 3 months ago - 3 comments
Labels: prod-blocker

#45 - When using Whisper's auto-detected language, insert that language into the Cocina

Issue - State: closed - Opened by andrewjbtw 3 months ago - 2 comments

#42 - [HOLD] only need vtt and txt files in output

Pull Request - State: closed - Opened by peetucket 3 months ago - 3 comments

#41 - Whisper should only produce .txt and .vtt files

Issue - State: closed - Opened by peetucket 3 months ago - 1 comment

#40 - Add type checking

Pull Request - State: closed - Opened by edsu 4 months ago

#39 - Mocked AWS & Github Action

Pull Request - State: closed - Opened by edsu 4 months ago

#38 - Stanza API docs review

Issue - State: closed - Opened by laurensorensen 4 months ago - 7 comments
Labels: question

#37 - API docs overview -- how does the file return to us via the Amara API?

Issue - State: closed - Opened by laurensorensen 4 months ago - 1 comment

#36 - finish up CI configuration

Issue - State: closed - Opened by jmartin-sul 4 months ago
Labels: blocked

#35 - Investigate Whisper.writer parameters

Issue - State: closed - Opened by alundgard 4 months ago - 4 comments

#34 - Simplify bucket and job message

Pull Request - State: closed - Opened by edsu 4 months ago

#33 - Adjust locations for AWS Whisper Container

Issue - State: closed - Opened by peetucket 4 months ago - 1 comment

#32 - write DevOpsDocs for speech-to-text infrastructure

Issue - State: open - Opened by jmartin-sul 4 months ago - 1 comment
Labels: blocked, prod-blocker

#31 - Log media size and duration

Issue - State: closed - Opened by edsu 4 months ago
Labels: enhancement

#30 - Added logging and removed caching

Pull Request - State: closed - Opened by edsu 4 months ago

#29 - Improve logging

Issue - State: closed - Opened by edsu 4 months ago
Labels: enhancement

#28 - ExpiredToken when calling the ReceiveMessage

Issue - State: closed - Opened by edsu 4 months ago
Labels: bug

#27 - Add Honeybadger

Issue - State: closed - Opened by edsu 4 months ago

#26 - Add new Turbo model

Pull Request - State: closed - Opened by edsu 4 months ago

#25 - minor enhancements: return technical metadata, allow job ID specification for testing

Pull Request - State: closed - Opened by jmartin-sul 4 months ago - 4 comments
Labels: hacktoberfest-accepted

#24 - speech-to-text worker sends back some basic technical metadata in the body of the done message it queues

Issue - State: closed - Opened by jmartin-sul 4 months ago

#23 - should we automatically update the model files that whisper uses? if so, at what frequency and with what mechanism?

Issue - State: open - Opened by jmartin-sul 5 months ago - 6 comments
Labels: question, implementation question

#22 - small readme and comment touchups

Pull Request - State: closed - Opened by jmartin-sul 5 months ago - 1 comment
Labels: hacktoberfest-accepted

#21 - DONE message should include output file

Issue - State: closed - Opened by edsu 5 months ago - 1 comment

#20 - TODO job should just include ID

Issue - State: closed - Opened by edsu 5 months ago - 1 comment

#19 - investigate expected cost of cloud deployment, as well as possible approaches for measuring cost

Issue - State: open - Opened by jmartin-sul 5 months ago - 2 comments

#12 - Run tests as Github Action

Issue - State: closed - Opened by edsu 5 months ago

#9 - Add initial Docker container

Pull Request - State: closed - Opened by edsu 5 months ago

#8 - Finish skeleton common-accessioning robot and workflow def for... `captionWF`? `speechToTextWF`? [final name TBD]

Issue - State: closed - Opened by jmartin-sul 5 months ago - 1 comment
Labels: blocked

#7 - Productionize speech-to-text pipeline

Issue - State: open - Opened by jmartin-sul 5 months ago - 1 comment
Labels: question, blocked, prod-blocker

#6 - provision S3 buckets and credentials for temp space to hold 1) the incoming content to run through speech-to-text generation, and 2) the generated transcript/caption files.

Issue - State: closed - Opened by jmartin-sul 5 months ago - 3 comments

#5 - Questions surrounding speech_to_text_generation_service (e.g., do we need a speech_to_text_request_service REST API?)

Issue - State: closed - Opened by jmartin-sul 5 months ago - 1 comment
Labels: blocked

#4 - [investigate/prototype] speech_to_text_generation_service approach 2: Explore AWS SageMaker

Issue - State: open - Opened by jmartin-sul 5 months ago - 3 comments

#3 - [investigate/prototype] speech_to_text_generation_service approach 1: Define a Docker container for running open source Whisper in a container that we define and for which we manage deployment (lives in this repo?)

Issue - State: closed - Opened by jmartin-sul 5 months ago - 2 comments

#2 - Choose an approach for producing speech-to-text output, given media file input (let's call this speech_to_text_generation_service for now?)

Issue - State: closed - Opened by jmartin-sul 5 months ago - 1 comment
Labels: blocked

#1 - [EPIC] Prototype workflow for generating and accessioning speech-to-text extraction

Issue - State: open - Opened by jmartin-sul 5 months ago
