Tucan AI
AI-driven transcription software that makes correcting transcripts more efficient.
My role
Product Designer
Design Researcher
Brand Designer
Web Designer
Team
Lukas Rintelen, Co-Founder & CEO
Florian Polak, Co-Founder & CEO
Michael Schramm, Co-Founder & CTO
Multiple Developers
Tools
Sketch
Figma
Adobe CC
Miro
Typeform
Timeline
2020
Context
I joined RecTag (later Tucan) in April 2019 as a Founding UI/UX Designer. My main role was to create the design language and the MVP for an early-stage startup.
At its core, the idea was to build products that collect audio data to train a speech-to-text algorithm. We started working on a voice-based social media app called RecTag (similar to an early version of Clubhouse). Later, we transitioned to a playful podcast discovery platform offering transcripts - RecTag Podcasting. Eventually, we launched a transcription platform that summarizes meetings - Tucan (later Tucan AI).
In this case study, I will highlight key learnings and decisions made at each pivot point. Additionally, I will discuss the latest process of creating Tucan, the most recent pivot.
Background Story
In early 2019, Lukas and Florian approached me while we were working together at the startup Careship in Berlin. At that time, I was a Product Designer and Design Researcher.
Their pitch was for a mobile app designed to share opinions and experiences within local neighborhoods. We decided to work together, and I developed three design directions.
Research Insights
Two months later, we started with the Double Diamond Design Process (Discover, Define, Design, and Deliver).
To understand user needs and pain points in social media apps and the content-sharing process, I conducted a survey with questions focused on uncovering the motivations behind a user's decision to use social media and share content.
After surveying 38 people and analyzing their responses, I identified the factors that motivated users to share on social media or held them back from sharing:
- Connecting with friends, family, and other people
- Seeking validation and approval
- Engaging with and sharing entertaining content
- Fear of judgment
Key insights from survey responses
- 67% of participants said that seeking validation was one reason they posted content, while ~28% said that fear of judgment prevented them from sharing.
- In response to the question, "What are the key motivators for deciding whether to post on social media?" 81% said it was to engage with and share entertaining content.
- 78% of participants expressed a desire to connect with friends and family.
The most interesting result of this survey for us was the discrepancy between the 67% who seek validation and the 28% who are deterred from sharing content due to fear of judgment.
Early App Concept & Creation
After analyzing the survey results, it became clear that one user frustration was the hesitation to share on social media due to the fear of judgment.
To gain a deeper understanding of anonymized communities, we conducted a benchmark analysis. Reddit and Twitter were at the forefront of our analysis.
Key takeaways
- User Identification → Implement Pseudonyms
- Pseudonymous Communities → Allow users to share anonymously
- Upvoting and Downvoting → Enable users to curate content
Based on our research, the survey, and the proof of concept from our benchmark analysis, we decided to build all three concepts.
What is RecTag?
RecTag is a voice-based social media app, built from the ground up, designed to connect users within a city by sharing Tags (Thread Pins) through recorded voice messages. Users could comment on and engage with Tags, which were sorted by level of interaction. The profile aimed to showcase user engagement without the fear of judgment.
Development & Traction
The app was launched in early 2020, with the first users sharing audio messages and training our model. We reached 1k MAU within four months.
Features built:
- Recording voice and receiving a transcript to share at your location.
- Exploring and listening to Tags on the map nearby.
- Commenting on Tags to engage with other users.
- A profile section with followers and rankings to add a gaming element.
Challenges & Learnings
Feature-Driven Growth Does Not Address Fundamental Issues
Our user base plateaued at 500, leading us to re-engage with our users to better understand their issues and needs. We developed several new features in the hope that they would spur growth, but this was not the case.
Think Outside the Box to Get Feedback and Signups
To gather feedback and signups, we took a special approach after our app had been live for two months. We decided to meet users where they were, in their daily routines. For two weeks, we visited a park daily with a sign that offered "Free Ice Cream for Feedback and Signup to Our App." This approach was a small success, resulting in 300 new users.
Learn from Failures
A notable success was improving our speech-to-text algorithm's accuracy from 61% to 79%. We also realized how important it is that an idea addresses significant user pain points and delivers real value.
Creating a Community is Challenging
Finding ambassadors to share exclusive content on the platform is ideal. We managed to attract some well-known individuals to the platform (mainly comedians), but we never achieved our growth targets or made a significant splash.
Pivot to RecTag Podcasting
We were accepted into the Axel Springer and Porsche Accelerator, the largest accelerator program in Germany. The 12-week program and its mentorship led us to pivot towards creating a podcast discovery platform.
The concept was to develop a playful podcast discovery platform that connects users with meaningful content and creators with their audiences. Strategically, the pivot was based on the need for more audio data to train our algorithm: the average podcast lasts 30 minutes, versus an average Tag length of 30 seconds in the previous product, or roughly 60 times as much audio per item.
What is RecTag Podcast Discovery?
The first version was launched on both the Google Play Store and the App Store. Drawing from our experiences with RecTag 1.0, we built a podcast discovery feature. The app's core was the Explorer, where users could swipe through and play podcast trailers, a concept we took from swiping through Tags in RecTag 1.0. Besides discovery, we added curation features like Like and Dislike. To increase engagement, listeners could comment directly at specific points in an episode, providing creators with targeted feedback on their content.
Design Process & Research
Discover
- RecTag goals helped us to create a common understanding of what we wanted to achieve
- Competitor analysis allowed us to understand market gaps
- Surveying podcast listeners and creators allowed us to understand user needs and pains
  - Listeners fell into two groups: Power Listeners and Occasional Listeners
  - Listeners wanted to discover niche podcasts that they struggled to find on other platforms
  - Podcast discovery was mainly done using Top Podcast Charts
  - Creators had difficulty promoting their podcasts and gaining new listeners
  - Creators faced barriers to getting quick feedback from listeners to iterate on
- Mapping out podcast discovery allowed us to understand how users find new content
Define
- Internal meetings helped everyone on the team to work in one direction
- We mapped out the opportunities to build vital USPs
Design
- I sketched and ideated, then created a prototype and took it to users to understand how they would use it
- User tests with a clickable prototype allowed us to get feedback on ideas and sketches
- Internal Design Reviews allowed us to keep all stakeholders aligned
- User Flows helped us to understand and define how users use the app
Deliver
- Our developers were close partners in the creation of the user experience
- I handed over the design files and user flows via Figma
What was different to RecTag 1.0?
- We increased the average time spent on the platform by 600%. This was crucial for training our algorithm. Next to audiobooks, podcasts are the longest audio format.
- We reached 35k MAU four months after we launched in June 2020.
- We created a platform for small Creators to reach new listener audiences.
- Compared to other platforms, listeners discovered new podcasts within minutes using a swipe gesture.
- We built a responsive app, and SEO became our new best friend. Neither had been available to us with a native mobile app. This accelerated our reach enormously.
- We started an exclusive podcast series with Porsche as a sponsor. This led to thousands of new users. We had tried this strategy before with RecTag 1.0, but this time our execution was more successful.
Challenges & Learnings
Content Discovery
With hundreds of podcasts on a single platform, it was a challenge to ideate patterns that allowed users on the go to discover new podcasts.
Maintaining a Clean Interface
Building a podcast platform with community features resulted in a large number of features that had to fit on a 375px by 667px mobile screen. I designed mobile-first, and working out what mattered most to users made design decisions difficult. Pen and paper plus Marvel were my friends.
Building Social Features
RecTag's key USPs lay in offering features that allowed users to interact with podcasts and their Creators. I identified multiple opportunities but had difficulty deciding which features to include. Interviews and mapping out what's important to users helped.
Monetization is key
Positioning RecTag as a podcast creator helped us launch two podcasts that generated our first profits. This was the only monetization strategy we tested and executed. We evaluated whether we could charge for a premium version of the platform but decided against it.
Finding Great Creators
Nobody on our team had had direct contact with the podcast or creator industry before. To overcome this challenge, we hosted comedy nights and other events where Creators could perform live and collaborate with other Creators. This positioned us as experts and hosts.
Licensing Slows Growth
We had signed a deal with Porsche, but for a very long time we were not able to communicate or promote this project. The process took multiple months, resulting in big frustrations and wasted time.
Why Pivot again?
The team and investors were not satisfied with the traction and growth we had achieved. We had underestimated the effort and time needed to enter the market and to find and build a community of creators. The initial business decision was made together with the team at APX.
At this point, we were able to transcribe audio content to text with a very high accuracy level of 88%, which was top-notch compared to the competition.
Personal Context
I handed in my resignation four months before the end of this project. I wanted to move on to the next challenge. My motivation was high to leave with a good impression and to conclude with a final project that reflected all our learnings.
New Project Goal
How Can We Create a Business Case With Our High-Accuracy Speech-to-Text Algorithm?
The market has introduced easy-to-use platforms for transcribing audio to text, but we believe there is a lack of services that also train the algorithm. Previous target groups, such as podcasters, needed to create transcripts to increase their reach but lacked platforms that could generate accurate transcripts. What is the right approach for RecTag to bridge this gap?
Fast Transcript Correction in the Transcript Creation Process
We're increasing the accuracy of 'useless transcripts that need hours to be reworked' by assisting podcasters in the most time-consuming part of the process. This not only helps to train our algorithm but also results in a short-term increase in accuracy.
'Fast Transcript Editing' as a Competitive Advantage
Offering suggestions to support podcasters in correcting transcripts is a service that other transcription platforms are not providing.
Increased Algorithm Accuracy Level
We estimate that the ability to 'correct' more transcripts could lead to a 4-7% increase in accuracy and serve as an easy-to-use tool for transcription.
Challenge
“How can we gain a competitive advantage by making the transcription process more efficient and accurate?”
How can we make it as easy as possible for users to correct transcripts while feeding our algorithm with feedback to improve accuracy? - APX Stakeholder
Creating transcripts is very time-consuming - we need to give users tools to speed up the process. - RecTag Team Member
In the past we managed to get a lot of audio data, but to create very high accuracy levels you need humans to interact with that data. I wonder how we can accelerate that process while creating value for our users. - RecTag Team Member
Grammarly makes giving feedback on suggested text corrections very seamless.
You can correct pages of long text within minutes.
Transcription is much needed by journalists, interviewers, media creators, lawyers, researchers, and medical workers.
Creating transcripts isn't just about refining the final output of your content.
It's there to eliminate time spent on editing and to allow you to focus on audio/video creation.
How do we get from left to right?
Innovative Process • Double Diamond
Inspiration
Suggestions
Platforms like Grammarly have optimized their text improvement and suggestion flow for years. I was inspired by the ease of use and speed: you can review two pages in just minutes.
Editing While Listening Experiences
Competing platforms like Trint and Happy Scribe faced similar challenges in creating 'edit while listening' experiences in the context of transcripts. Both platforms were packed with features, which overwhelmed the users we spoke to.
Simplicity of design
Whereby has nothing to do with editing text, but I was inspired by its simplicity as well as the human touch in its UX writing.
Identified opportunities
Suggesting Improvements
During busy weekdays, we observed that content creators and professionals lack the energy to correct 70% of a transcribed text they receive. Many decide to 'work around the transcript' or choose to 'write it from scratch' instead. We believe that RecTag can reduce the time spent on transcription corrections.
Speaker Detection
Since we started working with the podcast format, we have seen how challenging it can be to edit audio for a podcast, especially distinguishing between two voices. As most podcasts have two speakers, we must offer a feature that detects speakers and asks the user to confirm them.
Inventions
Detect errors and make suggestions
Key Insights
We found that users don't have the time to spend hours editing a transcript. They would rather not offer one, or hire an agency to do it.
Solution
Inspired by Grammarly's approach, we introduced 'Alternatives' and 'Recents' so that users can pick a correction from a list of alternative words or from their recent corrections, which speeds up the correction process. Suggested changes are marked in the text minutes after the audio has been uploaded.
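The 'Alternatives' and 'Recents' idea can be illustrated with a small data model. This is a minimal sketch, not Tucan's actual implementation: the names `FlaggedWord` and `CorrectionSession`, the rule that recent corrections are listed before alternatives, and the `feedback` list that collects (transcribed, corrected) pairs for later training are all assumptions made for illustration.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlaggedWord:
    """A transcribed word the algorithm marked as uncertain (hypothetical model)."""
    index: int                # position of the word in the transcript
    text: str                 # what the algorithm transcribed
    alternatives: list[str]   # other candidates proposed by the algorithm


@dataclass
class CorrectionSession:
    """Tracks one user's corrections for one transcript (assumed structure)."""
    words: list[str]
    max_recents: int = 5
    recents: deque = field(default_factory=deque)
    feedback: list[tuple[str, str]] = field(default_factory=list)

    def suggestions(self, flagged: FlaggedWord) -> list[str]:
        """List recent corrections first, then alternatives, without duplicates."""
        result: list[str] = []
        for candidate in list(self.recents) + flagged.alternatives:
            if candidate != flagged.text and candidate not in result:
                result.append(candidate)
        return result

    def apply(self, flagged: FlaggedWord, choice: str) -> None:
        """Replace the flagged word and remember the choice."""
        self.words[flagged.index] = choice
        # Each correction is a (transcribed, corrected) pair that could
        # later be fed back into training.
        self.feedback.append((flagged.text, choice))
        if choice in self.recents:
            self.recents.remove(choice)
        self.recents.appendleft(choice)
        while len(self.recents) > self.max_recents:
            self.recents.pop()


session = CorrectionSession(words=["hello", "wreck", "tag", "wreck"])
first = FlaggedWord(index=1, text="wreck", alternatives=["rec", "rack"])
session.apply(first, "rec")

# The same mistake appears again; the recent correction is offered first.
second = FlaggedWord(index=3, text="wreck", alternatives=["rack", "rec"])
print(session.suggestions(second))
```

Listing recents first reflects the assumption that a user who corrected a word once is likely to make the same correction again within the same transcript.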
Expected Results
- Faster transcript completion. By providing a best-in-class algorithm that is easily accessible and user-friendly.
- Faster training of the algorithm. By making it seamless for users to give feedback on the transcript.
- Boost user satisfaction. By offering an all-round solution for transcript creation and editing, we change the perception that transcripts take hours of work before they create value.
Stinky Fish
A huge stinky fish is obviously related to technology: the more accurate the algorithm's results are, the less correction a user needs to do. This can be very challenging, especially across different languages and slang.
User Feedback
Testing was limited because the technology was not yet built. Testers were excited about using the prototype and about how seamless selecting from alternatives was. For the test, we ensured that the alternatives provided were very accurate.
Next Steps
- Test the Live Algorithm: How valuable are alternatives to the marked errors?
- How fast will this improve the algorithm?
- How are we pricing transcripts?
What is the Tucan Summarizing Platform?
- Dashboard - Overview of currently transcribing audio files with direct access to export.
- Uploading - Users can upload multiple audio files and receive notifications via email once processing is complete. We support various file types such as FLAC, WAV, MP3, AAC, and M4A.
- Editor - Edit and listen in one place, while correcting with suggestions.
- Export - Export your transcripts to most common file formats.
- Team - Easily invite all your team members.
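The upload step described above accepts several files at once and a limited set of audio formats. A minimal sketch of how such a check could look follows. The helper `split_uploads` and the constant `SUPPORTED_EXTENSIONS` are hypothetical names, and the extension list covers only the formats named above, which the text presents as examples rather than a complete list.

```python
from pathlib import Path

# Formats named in the feature list; the real platform may accept more.
SUPPORTED_EXTENSIONS = {".flac", ".wav", ".mp3", ".aac", ".m4a"}


def split_uploads(filenames):
    """Split a batch of file names into accepted and rejected uploads."""
    accepted, rejected = [], []
    for name in filenames:
        # Compare case-insensitively so "Interview.MP3" is accepted too.
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            accepted.append(name)
        else:
            rejected.append(name)
    return accepted, rejected


accepted, rejected = split_uploads(["episode1.MP3", "notes.txt", "call.flac"])
print(accepted, rejected)
```

Checking only the file extension is a simplification. A production upload flow would usually also inspect the file contents.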
Branding
Creating a new transcription platform required me to consider the platform's look and feel, as well as how the visual language would align with its purpose of providing a simple yet enjoyable experience.
Logo
Name
The toucan is widely seen as a very friendly, social, colorful, and exotic animal that lives in the jungle among other rare animals. Our mission is to create a transcription platform that effectively handles all languages and dialects.
The brand name 'Toucan' was too complex, so we opted to simplify it to 'Tucan'.
Colors
The main color of the bird's beak, orange, is vibrant and energetic, so the brand's color palette should also be playful and meaningful.
Typography
'Basier Circle' is the font we chose for all three products. Its simplicity and sharp lettering have always been well-received, aligning with one of our core values in creating this platform.
Challenges & Learnings
Validating Core Features
We ideated around algorithm-based features that were not yet technically built. This made it difficult to gather feedback. I developed a complex prototype that helped users envision themselves in real-life scenarios. Once the first version is live, further user tests can yield more insights.
Building a Holistic Platform Rapidly for a New Target Audience and New Business Case
Defining and understanding the design of a complete platform from scratch was a challenge I had solved multiple times before. The challenge with Tucan was the need to switch directions within months. Through extensive research and workshops, our team was able to rapidly change the course of development.
Conclusion
Since joining RecTag in April 2019, I have learned to work in a fast-paced environment, navigate two big course changes, and adapt. As the sole designer and researcher, I have strengthened my problem-solving skills. I learned to make decisions based on user goals as well as business objectives.
Talk about your project
I'd love to hear what you think. You can reach me on LinkedIn or by emailing hi [at] alexander-michaelis [dot] com
Best of all, request a free 30-minute product feedback session with me and talk about your project.