PTE Speech Grading
- #Education
Automated ML solution for English speech transcription and grading.
- Data Analysis
- Machine Learning

Impact
- The solution enables automatic, quick, and reliable grading of people’s speech, facilitating their English language studies and skill improvement.
- It scales to accommodate a larger number of users.
Services we provided
An automated ML solution for transcribing and grading English audio, utilizing MFCC features and Whisper’s architecture to enable efficient and scalable speech assessment.
Tech Stack
Python
TensorFlow
Hugging Face
Pandas
NumPy
Flask
Challenges and Solutions
🧐 Challenges
- Procuring data for training the model to accurately grade input audio
- Developing and training the models for transcribing audio and grading speech
- Creating a pipeline for processing audio of varying length
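One common way to handle audio of varying length is to cut the waveform into fixed-size windows and zero-pad the last one. Whisper models work on 16 kHz audio in 30-second windows. The sketch below shows that idea with NumPy only; the function name `split_into_windows` and the handling of empty input are illustrative assumptions, not a description of the project's actual pipeline.

```python
import numpy as np

SAMPLE_RATE = 16_000   # Whisper models expect 16 kHz mono audio
WINDOW_SECONDS = 30    # Whisper's encoder consumes 30-second windows
WINDOW_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS


def split_into_windows(audio, window=WINDOW_SAMPLES):
    """Split a 1-D waveform into fixed-size windows, zero-padding the last one.

    Returns an array of shape (num_windows, window). An empty input yields
    a single all-zero window so downstream code always receives a batch
    (an assumption made for this sketch).
    """
    audio = np.asarray(audio, dtype=np.float32).reshape(-1)
    num_windows = max(1, -(-audio.size // window))  # ceiling division
    padded = np.zeros(num_windows * window, dtype=np.float32)
    padded[: audio.size] = audio
    return padded.reshape(num_windows, window)


# Usage: a 45-second clip becomes two 30-second windows,
# the second one half filled with padding.
clip = np.ones(45 * SAMPLE_RATE, dtype=np.float32)
windows = split_into_windows(clip)
print(windows.shape)  # (2, 480000)
```

Each window can then be featurized and scored independently, with per-window results combined afterwards (for example by averaging); how to combine them is a design choice this sketch leaves open.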
💡 Solutions
We scraped diverse English-learning data and trained rubric-specific models. The architecture entailed:
- Extracting MFCC features from the audio.
- Encoding them into an embedding using the encoder from OpenAI’s Whisper.
- Decoding the embedding with several convolutional layers with residual connections, followed by several dense layers, to obtain the final grade.
- Producing the speech transcription using OpenAI’s base Whisper model.
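The grading head described above (residual convolutions followed by dense layers) can be sketched as a plain NumPy forward pass. This is an illustration under stated assumptions, not the trained model: the layer counts, kernel width, ReLU activations, mean pooling over time, tiny dimensions, and random weights are all hypothetical, and a random array stands in for the Whisper encoder's output. NumPy is used instead of TensorFlow only to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # illustrative embedding width; a real encoder output is much wider


def conv1d_same(x, kernel, bias):
    """1-D convolution over time with 'same' zero padding (odd kernel width).

    x: (time, channels_in); kernel: (width, channels_in, channels_out).
    """
    width = kernel.shape[0]
    pad = width // 2
    padded = np.pad(x, ((pad, pad), (0, 0)))
    out = np.empty((x.shape[0], kernel.shape[2]))
    for t in range(x.shape[0]):
        # Contract the (width, channels_in) patch against the kernel.
        patch = padded[t : t + width]
        out[t] = np.tensordot(patch, kernel, axes=([0, 1], [0, 1])) + bias
    return out


def residual_block(x, kernel, bias):
    """Add ReLU(conv(x)) back onto x; the addition is the skip connection."""
    return x + np.maximum(conv1d_same(x, kernel, bias), 0.0)


def grade(embedding, conv_params, dense_params):
    """Map a (time, DIM) embedding to a single scalar score."""
    h = embedding
    for kernel, bias in conv_params:
        h = residual_block(h, kernel, bias)
    h = h.mean(axis=0)  # pool over time so any length maps to one vector
    for i, (w, b) in enumerate(dense_params):
        h = h @ w + b
        if i < len(dense_params) - 1:
            h = np.maximum(h, 0.0)  # ReLU on hidden dense layers only
    return float(h[0])


# Hypothetical, untrained parameters: two residual conv blocks, two dense layers.
conv_params = [
    (rng.normal(scale=0.1, size=(3, DIM, DIM)), np.zeros(DIM)) for _ in range(2)
]
dense_params = [
    (rng.normal(scale=0.1, size=(DIM, 4)), np.zeros(4)),
    (rng.normal(scale=0.1, size=(4, 1)), np.zeros(1)),
]

embedding = rng.normal(size=(20, DIM))  # stand-in for an encoder embedding
score = grade(embedding, conv_params, dense_params)
print(score)
```

Because each residual block adds its output to its input, the time and channel dimensions are preserved through the convolutional stack, and the pooling step is what lets inputs of different lengths map to one score. With one such head trained per rubric, each rubric would get its own set of weights.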
User flow