Class WhisperJNI

java.lang.Object
io.github.givimad.whisperjni.WhisperJNI

public class WhisperJNI extends Object
The WhisperJNI class provides access to whisper.cpp through the JNI.
Author:
Miguel Álvarez Díez - Initial contribution
  • Constructor Details

    • WhisperJNI

      public WhisperJNI()
  • Method Details

    • init

      public WhisperContext init(Path model) throws IOException
      Creates a new whisper context.
      Parameters:
      model - Path to the whisper ggml model file.
      Returns:
      A new WhisperContext.
      Throws:
      IOException - if model file is missing.
    • init

      public WhisperContext init(Path model, WhisperContextParams params) throws IOException
      Creates a new whisper context.
      Parameters:
      model - Path to the whisper ggml model file.
      params - WhisperContextParams params for context initialization.
      Returns:
      A new WhisperContext.
      Throws:
      IOException - if model file is missing.
    • initNoState

      public WhisperContext initNoState(Path model) throws IOException
      Creates a new whisper context without state.
      Parameters:
      model - Path to the whisper ggml model file.
      Returns:
      A new WhisperContext without state.
      Throws:
      IOException - if model file is missing.
    • initNoState

      public WhisperContext initNoState(Path model, WhisperContextParams params) throws IOException
      Creates a new whisper context without state.
      Parameters:
      model - Path to the whisper ggml model file.
      params - WhisperContextParams params for context initialization.
      Returns:
      A new WhisperContext without state.
      Throws:
      IOException - if model file is missing.
    • initState

      public WhisperState initState(WhisperContext context)
      Creates a new whisper.cpp state for the provided context.
      Parameters:
      context - the WhisperContext of this state.
      Returns:
      A new WhisperState.
    • parseGrammar

      public WhisperGrammar parseGrammar(Path grammarPath) throws IOException
      Throws:
      IOException
    • parseGrammar

      public WhisperGrammar parseGrammar(String text) throws IOException
      Throws:
      IOException
    • initOpenVINO

      public void initOpenVINO(WhisperContext context, String device)
      Initializes the OpenVINO encoder.
      Parameters:
      context - a WhisperContext instance.
      device - the device name.
    • isMultilingual

      public boolean isMultilingual(WhisperContext context)
      Checks whether the model is multilingual.
      Parameters:
      context - the WhisperContext to check.
      Returns:
      true if the model supports multiple languages
    • full

      public int full(WhisperContext context, WhisperFullParams params, float[] samples, int numSamples)
      Run whisper.cpp full audio transcription.
      Parameters:
      context - the WhisperContext used to transcribe.
      params - a WhisperFullParams instance with the desired configuration.
      samples - the audio samples (f32 encoded samples with sample rate 16000).
      numSamples - the number of audio samples provided.
      Returns:
      a result code; values other than 0 indicate a problem.
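
The samples parameter expects f32 samples at a 16000 Hz sample rate, while audio is often stored as 16-bit PCM. The sketch below shows one way such data might be converted. PcmSamples and fromPcm16Le are hypothetical names, not part of WhisperJNI, and the code assumes the input is already signed 16-bit little-endian audio at 16000 Hz with a single channel.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of WhisperJNI.
// Assumes signed 16-bit little-endian PCM, single channel, already at 16000 Hz.
final class PcmSamples {
    private PcmSamples() {}

    /** Converts raw PCM16 bytes to float samples in the range [-1.0, 1.0). */
    static float[] fromPcm16Le(byte[] pcm) {
        if (pcm.length % 2 != 0) {
            throw new IllegalArgumentException("PCM16 data must contain an even number of bytes");
        }
        ByteBuffer buffer = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        float[] samples = new float[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            // Scale the signed 16-bit value by 2^15 to normalize it.
            samples[i] = buffer.getShort() / 32768f;
        }
        return samples;
    }
}
```

The returned array and its length could then be passed as samples and numSamples.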
    • fullWithState

      public int fullWithState(WhisperContext context, WhisperState state, WhisperFullParams params, float[] samples, int numSamples)
      Run whisper.cpp full audio transcription.
      Parameters:
      context - the WhisperContext used to transcribe.
      state - the WhisperState used to transcribe.
      params - a WhisperFullParams instance with the desired configuration.
      samples - the audio samples (f32 encoded samples with sample rate 16000).
      numSamples - the number of audio samples provided.
      Returns:
      a result code; values other than 0 indicate a problem.
    • fullNSegmentsFromState

      public int fullNSegmentsFromState(WhisperState state)
      Gets the available number of text segments.
      Parameters:
      state - the WhisperState used to transcribe
      Returns:
      available number of segments
    • fullNSegments

      public int fullNSegments(WhisperContext context)
      Gets the available number of text segments.
      Parameters:
      context - the WhisperContext used to transcribe
      Returns:
      available number of segments
    • fullGetSegmentTimestamp0

      public long fullGetSegmentTimestamp0(WhisperContext context, int index)
      Gets start timestamp of text segment by index.
      Parameters:
      context - a WhisperContext used to transcribe
      index - the segment index
      Returns:
      start timestamp of segment text, 800 -> 8s
    • fullGetSegmentTimestamp1

      public long fullGetSegmentTimestamp1(WhisperContext context, int index)
      Gets end timestamp of text segment by index.
      Parameters:
      context - a WhisperContext used to transcribe
      index - the segment index
      Returns:
      end timestamp of segment text, 1050 -> 10.5s
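
The examples above (800 -> 8s, 1050 -> 10.5s) suggest that timestamps are expressed in units of 10 ms. Under that assumption, a minimal sketch of a conversion helper could look as follows. SegmentTimestamps and its methods are hypothetical names, not part of WhisperJNI.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper, not part of WhisperJNI.
// Assumes one timestamp unit equals 10 ms, as the documented examples suggest.
final class SegmentTimestamps {
    private SegmentTimestamps() {}

    /** Converts a segment timestamp to a Duration. */
    static Duration toDuration(long timestamp) {
        return Duration.ofMillis(timestamp * 10);
    }

    /** Formats a segment timestamp as HH:mm:ss.SSS. */
    static String format(long timestamp) {
        Duration d = toDuration(timestamp);
        return String.format(Locale.ROOT, "%02d:%02d:%02d.%03d",
                d.toHours(), d.toMinutesPart(), d.toSecondsPart(), d.toMillisPart());
    }
}
```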
    • fullGetSegmentText

      public String fullGetSegmentText(WhisperContext context, int index)
      Gets text segment by index.
      Parameters:
      context - a WhisperContext used to transcribe
      index - the segment index
      Returns:
      the segment text
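
Reading a transcription typically means looping over the segment count and fetching the timestamps and text for each index. The sketch below illustrates that loop without depending on the library: the accessors are passed in as functions. SegmentCollector and Segment are hypothetical names, and zero-based segment indices are an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;
import java.util.function.IntToLongFunction;

// Hypothetical helper, not part of WhisperJNI.
// Assumes segment indices run from 0 to count - 1.
final class SegmentCollector {
    private SegmentCollector() {}

    /** Simple value holder for one transcribed segment. */
    static final class Segment {
        final long start;
        final long end;
        final String text;

        Segment(long start, long end, String text) {
            this.start = start;
            this.end = end;
            this.text = text;
        }
    }

    /** Collects every segment by calling the supplied accessors once per index. */
    static List<Segment> collect(int count,
                                 IntToLongFunction startAt,
                                 IntToLongFunction endAt,
                                 IntFunction<String> textAt) {
        List<Segment> segments = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            segments.add(new Segment(startAt.applyAsLong(i), endAt.applyAsLong(i), textAt.apply(i)));
        }
        return segments;
    }
}
```

Given a WhisperJNI instance whisper and a WhisperContext ctx on which full returned 0, the call might be SegmentCollector.collect(whisper.fullNSegments(ctx), i -> whisper.fullGetSegmentTimestamp0(ctx, i), i -> whisper.fullGetSegmentTimestamp1(ctx, i), i -> whisper.fullGetSegmentText(ctx, i)).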
    • fullGetSegmentTimestamp0FromState

      public long fullGetSegmentTimestamp0FromState(WhisperState state, int index)
      Gets start timestamp of text segment by index.
      Parameters:
      state - a WhisperState used to transcribe
      index - the segment index
      Returns:
      start timestamp of segment text, 1050 -> 10.5s
    • fullGetSegmentTimestamp1FromState

      public long fullGetSegmentTimestamp1FromState(WhisperState state, int index)
      Gets end timestamp of text segment by index.
      Parameters:
      state - a WhisperState used to transcribe
      index - the segment index
      Returns:
      end timestamp of segment text, 1050 -> 10.5s
    • fullGetSegmentTextFromState

      public String fullGetSegmentTextFromState(WhisperState state, int index)
      Gets text segment by index.
      Parameters:
      state - a WhisperState used to transcribe
      index - the segment index
      Returns:
      the segment text
    • free

      public void free(WhisperContext context)
      Release context memory in native implementation.
      Parameters:
      context - the WhisperContext to release
    • free

      public void free(WhisperState state)
      Release state memory in native implementation.
      Parameters:
      state - the WhisperState to release
    • free

      public void free(WhisperGrammar grammar)
      Release grammar memory in native implementation.
      Parameters:
      grammar - the WhisperGrammar to release
    • getSystemInfo

      public String getSystemInfo()
      Gets the whisper.cpp system info stream, which shows the features enabled in whisper.cpp.
      Returns:
      the whisper.cpp system info stream.
    • loadLibrary

      public static void loadLibrary() throws IOException
      Registers the native library. Should be called first.
      Throws:
      IOException - when unable to load the native library
    • loadLibrary

      public static void loadLibrary(WhisperJNI.LoadOptions options) throws IOException
      Registers the native library. Should be called first.
      Parameters:
      options - instance of WhisperJNI.LoadOptions to customize library load.
      Throws:
      IOException - when unable to load the native library.
    • setLibraryLogger

      public static void setLibraryLogger(WhisperJNI.LibraryLogger logger)
      Proxy whisper.cpp logger. Should be called after loadLibrary().
      Parameters:
      logger - whisper.cpp log consumer, or null to disable the library default log to stderr.
    • log

      protected static void log(String text)
      Called from the cpp side of the library to proxy the whisper.cpp logs.
      Parameters:
      text - whisper.cpp log line.