{"id":37720,"date":"2017-02-23T16:46:12","date_gmt":"2017-02-23T15:46:12","guid":{"rendered":"https:\/\/www.clickworker.com\/?page_id=37720\/"},"modified":"2023-02-24T16:21:40","modified_gmt":"2023-02-24T15:21:40","slug":"training-data-for-machine-learning-of-speech-recognition-systems","status":"publish","type":"page","link":"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/","title":{"rendered":"Training data for machine learning of speech recognition systems – Case Study"},"content":{"rendered":"
\r\n\t
\r\n\t
\r\n\t
\r\n\t

Speech recognition training data for software development<\/h1>\r\n\t <\/div>\r\n\t <\/div>\r\n\t
\r\n\t
\r\n\t \r\n\t <\/div>\r\n\t
\r\n\t \r\n\t

Case study \u2013 Creation and analysis of voice recordings as training data for speech recognition software<\/strong><\/p>\r\n\t

Thousands of Clickworkers record voice commands of the kind used to control car infotainment systems. The recordings are then transcribed and analyzed, giving the manufacturer a large body of speech recognition training data with which to train and optimize its speech recognition software.\r\n\t <\/p>\r\n\r\n\t <\/i> Get in touch with us!<\/a>\r\n\t\t\t\t\t\t\t\r\n\t\t\t\t\t\t\t<\/i> +1 (212) 878-6686<\/a>\r\n\t <\/i> +49 201 95971830<\/a>\r\n\t \r\n\r\n\t <\/div>\r\n\t <\/div>\r\n\t <\/div>\r\n\t<\/section>\r\n\t\r\n\r\n\r\n\t

<\/div>\r\n\r\n\r\n\t
\r\n\t
\r\n\t
\r\n\t
\r\n\t \r\n\t <\/div>\r\n\t
\r\n\t

The challenge for speech recognition training data<\/h2>\r\n\t

Voice control systems are only as good as their speech recognition. The biggest challenge is\r\n\t optimizing and training these speech recognition systems to react to the large variety of voice\r\n\t commands.<\/p>\r\n\r\n\t

\r\n\t \r\n\t Software developed without accounting for \u201chuman reason\u201d and \u201chuman behavior\u201d cannot become an\r\n\t ideal speech recognition system. In many cases, users\u2019 voice commands are either not recognized\r\n\t or misunderstood.<\/strong>\r\n\t <\/p>\r\n\t

Users often have to repeat a command several times before the system responds correctly\r\n\t and displays the desired information. This is time-consuming and, while driving,\r\n\t distracting.<\/p>\r\n\t

To widen the range of voice commands the system can recognize, it must be trained on speech\r\n\t recordings from thousands of different people, each with their own phrasing and\r\n\t pronunciation.<\/p>\r\n\t <\/div>\r\n\r\n\t <\/div>\r\n\r\n\t

<\/div>\r\n\r\n\t
\r\n\t
\r\n\t

The solution: creating audio datasets<\/a> to improve speech recognition software<\/h2>\r\n\t

Thousands of our Clickworkers from different countries and regions record how they would phrase a\r\n\t command to trigger a predefined reaction x or call up information y via the infotainment system. Even within the same language, every voice recording differs in word choice, word order, and the individual Clickworker\u2019s pronunciation.\r\n\t <\/p>\r\n\t

The speech recognition algorithms must also be trained to react to\r\n\t certain cues, such as keywords. In a second step, our Clickworkers therefore transcribe all the voice\r\n\t recordings and analyze the transcribed sentences to identify the keywords used and how often each occurs.<\/p>\r\n\t

With the help of these recordings, manufacturers train their speech recognition software and\r\n\t optimize the infotainment system to respond to the many different ways in which individual users\r\n\t operate it.<\/p>\r\n\t <\/div>\r\n\t

\r\n\t \r\n\t <\/div>\r\n\t <\/div>\r\n\r\n\t <\/div>\r\n\t<\/section>\r\n\r\n\r\n\t
\r\n\t
\r\n\r\n\t
\r\n\t
\r\n\t

Project Data<\/h2>\r\n\t <\/div>\r\n\t <\/div>\r\n\t
\r\n\r\n\r\n\t
\r\n\t
\r\n\t
\r\n\t
\r\n\r\n\t
    \r\n\t
  • Clickworker qualifications:<\/strong> Native speakers from the target\r\n\t regions<\/li>\r\n\t
  • Languages:<\/strong> 9 languages<\/li>\r\n\t
  • Number of voice recordings (in MP4 format):<\/strong> 810,000 (600\r\n\t recordings per scenario and language, across 150 scenarios and 9 languages)<\/li>\r\n\t
  • Tasks:<\/strong>\r\n\t
      \r\n\t
    • 1. Task: Create the audio recording<\/li>\r\n\t
    • 2. Task: Transcribe the recordings<\/li>\r\n\t
    • 3. Task: Analyze and evaluate the recordings<\/li>\r\n\t <\/ul>\r\n\t <\/li>\r\n\t
    • Quality assurance: <\/strong> a second Clickworker, the transcriber, checks\r\n\t the quality of the recordings<\/li>\r\n\t
    • Data transfer: <\/strong> via XLS file<\/li>\r\n\t <\/ul>\r\n\t <\/div>\r\n\t\t\t\t\t\t\t\r\n\t\t\t\t\t\t\t
      \r\n\t\t\t\t\t\t\t\t\r\n\t\t\t\t\t\t\t<\/div>\t\t\t\t\t\t\t\r\n\r\n\t <\/div>\r\n\t <\/div>\r\n\t <\/div>\r\n\r\n\t <\/div>\r\n\r\n\t
      <\/div>\r\n\r\n\t
      \r\n\t
      \r\n\t

      Workflow<\/h2>\r\n\t
        \r\n\t
      1. The project is discussed with the customer and the tasks are defined accordingly.\r\n\t <\/li>\r\n\t
      2. clickworker sets up the project in a three-stage distribution of tasks, including\r\n\t briefings for the Clickworkers and quality assurance.\r\n\r\n\t
          \r\n\t
        • 1. Task:<\/strong> Creation of voice recordings\r\n\t\t\t\t\t\t\t\t\t
            \r\n\t\t\t\t\t\t\t\t\t\t
          • Audio recordings in 9 languages<\/li>\r\n\t\t\t\t\t\t\t\t\t\t
          • 600 recordings per scenario and language, for 150 scenarios<\/li>\r\n\t\t\t\t\t\t\t\t\t\t
          • 1,200 Clickworkers per language are requested<\/li>\r\n\t\t\t\t\t\t\t\t\t\t
          • Audio format: MP4-files<\/li>\r\n\t\t\t\t\t\t\t\t\t<\/ul>\r\n\t\t\t\t\t\t\t\t<\/li>\r\n\t
          • 2. Task:<\/strong> Quality assurance and transcription\r\n\t\t\t\t\t\t\t\t\t
              \r\n\t\t\t\t\t\t\t\t\t\t
            • Checking and transcription of the 810,000 voice recordings made by native speakers<\/li>\r\n\t\t\t\t\t\t\t\t\t<\/ul>\t\t\t\t\t\t\t\t\r\n\t\t\t\t\t\t\t\t<\/li>\r\n\t
            • 3. Task:<\/strong> Analysis and evaluation\r\n\t\t\t\t\t\t\t\t\t
                \r\n\t\t\t\t\t\t\t\t\t\t
              • Identification of the keywords and calculation of their frequency per scenario and language<\/li>\r\n\t\t\t\t\t\t\t\t\t\t
              • Filtering of the phrases, including their frequency, per scenario and language<\/li>\r\n\t\t\t\t\t\t\t\t\t<\/ul>\t\t\t\t\t\t\t\t\r\n\t\t\t\t\t\t\t\t<\/li>\r\n\t <\/ul>\r\n\r\n\t <\/li>\r\n\t
              • The final task results are delivered to the customer as an XLS file.<\/li>\r\n\r\n\t <\/ol>\r\n\t <\/div>\r\n\t <\/div>\r\n\r\n\t <\/div>\r\n\t<\/section>\r\n\r\n\t
                \r\n\t
                \r\n\t
                \r\n\t
                \r\n\t \r\n\t <\/div>\r\n\t
                \r\n\t

                Benefits<\/h2>\r\n\t
                  \r\n\t
                • Speed<\/li>\r\n\t
                • Three services from a single source<\/li>\r\n\t
                • Simple access to know-how and language skills<\/li>\r\n\t
                • Quality-assured results<\/li>\r\n\t
                • Scalable throughput<\/li>\r\n\t
                • Flexible workforce<\/li>\r\n\r\n\t <\/ul>\r\n\t <\/div>\r\n\t <\/div>\r\n\t <\/div>\r\n\t<\/section>\r\n\r\n\t
                  \r\n\t
                  \r\n\t
                  \r\n\t
                  \r\n\t

                  The difficulties of speech recognition training data: machine\r\n\t learning and the human factor<\/h2>\r\n\t

                  Speech recognition has many useful applications that make day-to-day activities easier.\r\n\t Whether to search for something online, unlock a smartphone, or operate a car\r\n\t infotainment system, more and more programs rely on voice input. This poses challenges for\r\n\t software development. Because every person speaks differently, depending on dialect, individual\r\n\t mannerisms, or speech impediments, the program needs to be trained to recognize the same\r\n\t words in many variations. This is why the human factor plays such an important role in gathering\r\n\t speech recognition training data.<\/p>\r\n\r\n\t

                  \r\n\t Simply using one recording to train the system would not yield the desired results. Instead, we\r\n\t provide a multitude of different voice recordings that can help the machine learn. Once this\r\n\t foundation has been laid, the software can use the training data to come to the right conclusions\r\n\t and keep evolving.\r\n\t <\/p>\r\n\t <\/div>\r\n\t <\/div>\r\n\t <\/div>\r\n\r\n\t<\/section>","protected":false},"excerpt":{"rendered":"

                  Speech recognition training data for software development Case study \u2013 Creation and analysis of voice recordings as training data for speech recognition software Thousands of Clickworkers record voice commands which are used to control car infotainment systems. These are then transcribed and analyzed, providing the manufacturer with significant speech recognition training data, which is needed […]<\/p>\n","protected":false},"author":5,"featured_media":0,"parent":31877,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"blank_bootstrap.php","meta":{"footnotes":""},"yoast_head":"\nTraining data for machine learning of speech recognition systems<\/title>\n<meta name=\"description\" content=\"To optimize software for speech recognition, training data is collected by thousands of Clickworkers in various languages. Find out more!\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Training data for machine learning of speech recognition systems\" \/>\n<meta property=\"og:description\" content=\"To optimize software for speech recognition, training data is collected by thousands of Clickworkers in various languages. Find out more!\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/\" \/>\n<meta property=\"og:site_name\" content=\"clickworker.com\" \/>\n<meta property=\"article:modified_time\" content=\"2023-02-24T15:21:40+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/\",\"url\":\"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/\",\"name\":\"Training data for machine learning of speech recognition systems\",\"isPartOf\":{\"@id\":\"https:\/\/www.clickworker.com\/#website\"},\"datePublished\":\"2017-02-23T15:46:12+00:00\",\"dateModified\":\"2023-02-24T15:21:40+00:00\",\"description\":\"To optimize software for speech recognition, training data is collected by thousands of Clickworkers in various languages. Find out more!\",\"breadcrumb\":{\"@id\":\"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.clickworker.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Crowdsourcing Case Studies\",\"item\":\"https:\/\/www.clickworker.com\/case-studies\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Training data for machine learning of speech recognition systems – Case Study\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.clickworker.com\/#website\",\"url\":\"https:\/\/www.clickworker.com\/\",\"name\":\"clickworker.com\",\"description\":\"Your Content 
Provider\",\"publisher\":{\"@id\":\"https:\/\/www.clickworker.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.clickworker.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.clickworker.com\/#organization\",\"name\":\"clickworker\",\"url\":\"https:\/\/www.clickworker.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.clickworker.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.clickworker.com\/wp-content\/uploads\/2023\/06\/clickworkerCompactLogo.webp\",\"contentUrl\":\"https:\/\/www.clickworker.com\/wp-content\/uploads\/2023\/06\/clickworkerCompactLogo.webp\",\"width\":696,\"height\":696,\"caption\":\"clickworker\"},\"image\":{\"@id\":\"https:\/\/www.clickworker.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.linkedin.com\/company\/clickworker\/\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Training data for machine learning of speech recognition systems","description":"To optimize software for speech recognition, training data is collected by thousands of Clickworkers in various languages. Find out more!","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/","og_locale":"en_US","og_type":"article","og_title":"Training data for machine learning of speech recognition systems","og_description":"To optimize software for speech recognition, training data is collected by thousands of Clickworkers in various languages. 
Find out more!","og_url":"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/","og_site_name":"clickworker.com","article_modified_time":"2023-02-24T15:21:40+00:00","twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/","url":"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/","name":"Training data for machine learning of speech recognition systems","isPartOf":{"@id":"https:\/\/www.clickworker.com\/#website"},"datePublished":"2017-02-23T15:46:12+00:00","dateModified":"2023-02-24T15:21:40+00:00","description":"To optimize software for speech recognition, training data is collected by thousands of Clickworkers in various languages. Find out more!","breadcrumb":{"@id":"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.clickworker.com\/case-studies\/training-data-for-machine-learning-of-speech-recognition-systems\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.clickworker.com\/"},{"@type":"ListItem","position":2,"name":"Crowdsourcing Case Studies","item":"https:\/\/www.clickworker.com\/case-studies\/"},{"@type":"ListItem","position":3,"name":"Training data for machine learning of speech recognition systems – Case Study"}]},{"@type":"WebSite","@id":"https:\/\/www.clickworker.com\/#website","url":"https:\/\/www.clickworker.com\/","name":"clickworker.com","description":"Your 
Content Provider","publisher":{"@id":"https:\/\/www.clickworker.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.clickworker.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.clickworker.com\/#organization","name":"clickworker","url":"https:\/\/www.clickworker.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.clickworker.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.clickworker.com\/wp-content\/uploads\/2023\/06\/clickworkerCompactLogo.webp","contentUrl":"https:\/\/www.clickworker.com\/wp-content\/uploads\/2023\/06\/clickworkerCompactLogo.webp","width":696,"height":696,"caption":"clickworker"},"image":{"@id":"https:\/\/www.clickworker.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.linkedin.com\/company\/clickworker\/"]}]}},"_links":{"self":[{"href":"https:\/\/www.clickworker.com\/wp-json\/wp\/v2\/pages\/37720"}],"collection":[{"href":"https:\/\/www.clickworker.com\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.clickworker.com\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.clickworker.com\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.clickworker.com\/wp-json\/wp\/v2\/comments?post=37720"}],"version-history":[{"count":58,"href":"https:\/\/www.clickworker.com\/wp-json\/wp\/v2\/pages\/37720\/revisions"}],"predecessor-version":[{"id":72245,"href":"https:\/\/www.clickworker.com\/wp-json\/wp\/v2\/pages\/37720\/revisions\/72245"}],"up":[{"embeddable":true,"href":"https:\/\/www.clickworker.com\/wp-json\/wp\/v2\/pages\/31877"}],"wp:attachment":[{"href":"https:\/\/www.clickworker.com\/wp-json\/wp\/v2\/media?parent=37720"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}