HK1192399A - Systems and methods to present voice message information to a user of a computing device - Google Patents
- Publication number
- HK1192399A (application HK14105435.6A)
- Authority
- HK
- Hong Kong
- Prior art keywords
- user
- mobile device
- data
- message
- voice message
- Prior art date
Abstract
Systems and methods to process and/or present information relating to voice messages for a user that are received from other persons. In one embodiment, a method implemented in a data processing system includes: receiving first data associated with prior communications or activities for a first user on a mobile device; receiving a voice message for the first user; transcribing the voice message using the first data to provide a transcribed message; and sending the transcribed message to the mobile device for display to the user.
Description
Cross reference to related applications
The present application claims priority to U.S. provisional patent application No. 61/449,643, filed on day 21, 2011, and U.S. non-provisional patent application No. 13/528,693, filed on day 20, 2012, both filed by J. Bonforte and entitled "Systems and Methods to Present Voice Message Information to a User of a Computing Device," the entire contents of which are incorporated herein by reference.
This application is related to U.S. patent application No. 12/792,698, entitled "Auto-Fill Address Book," filed on June 2, 2010 by Smith et al. (published on December 2, 2010 as U.S. patent publication No. 2010/0306185), the entire contents of which are incorporated herein by reference.
Technical Field
At least some embodiments disclosed herein relate generally to information processing systems and, more particularly but not by way of limitation, to processing and/or presenting information related to voice messages that a user of a computing device receives from others (e.g., people who have called the user).
Background
Users of mobile devices (e.g., Android and iPhone devices) typically receive voice messages from others (e.g., friends or business partners). When the user of the mobile device is unavailable, the caller typically leaves a voice message. In many cases, a user may have many voice messages to review, and may wish to take subsequent action after reviewing one or more of the voice messages.
Disclosure of Invention
Systems and methods for processing and/or presenting voice messages received from others for a user are described herein. Some of the embodiments are summarized in this section.
In one embodiment, a method comprises: receiving first data associated with previous communications or activities of a first user on a mobile device of the first user; receiving, by a computing device, a voice message for the first user; transcribing, by the computing device, the voice message using the first data to provide a transcribed message; and sending the transcribed message to the mobile device for display to the first user.
In another embodiment, a method includes causing a mobile device of a first user to perform: sending, using the mobile device, first data to a computing apparatus, wherein the first data is associated with previous communications or activities of the first user on the first user's mobile device; sending, using the mobile device, a voice message for the first user to the computing apparatus; and receiving, at the mobile device, a transcribed message from the computing apparatus, wherein the computing apparatus has transcribed the voice message using the first data to create the transcribed message.
The present disclosure includes methods and apparatuses to perform these methods, including data processing systems that perform these methods and computer-readable media containing instructions which, when executed on a data processing system, cause the system to perform these methods.
Other features will be apparent from the accompanying drawings and from the detailed description that follows.
Drawings
Embodiments are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like references indicate similar elements.
FIG. 1 illustrates an example of a display screen provided to a user of a mobile device for viewing voice messages, according to one embodiment.
Fig. 2 shows an example of a display screen with options from which the user may select to correct a misspelled word in the voice message of Fig. 1.
Fig. 3 illustrates an example of a display of a list of voice messages waiting to be viewed by the user, according to one embodiment.
FIG. 4 illustrates a system for presenting voice message information to a user of a computing device, according to one embodiment.
FIG. 5 illustrates a block diagram of a data processing system, which can be used in various embodiments.
Fig. 6 shows a block diagram of a user equipment according to an embodiment.
Detailed Description
The following description and drawings are illustrative and are not to be construed as limiting. Numerous specific details are described to provide a thorough understanding. However, in certain instances, well-known or conventional details are not described in order to avoid obscuring the description. References to one or more embodiments in the present disclosure are not necessarily to the same embodiment, and such embodiments mean at least one.
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments necessarily referring to other embodiments. Furthermore, many of the features described can be exhibited by some embodiments and not by others. Similarly, many of the requirements described may be met with respect to some embodiments but not others.
In one embodiment, a computing device (e.g., a mobile device) belonging to a user stores (e.g., in a database in the form of personal profiles) data associated with previous communications and/or other activities of the user on the mobile device (e.g., data extracted from previous emails received by the user). A caller calls the mobile device and leaves a voice message for the user. The caller is identified (e.g., using caller ID). A subset of social data and/or other data related to the caller is retrieved from the user's database (e.g., the caller's personal profile and/or a predetermined number of recent emails sent by the caller to the user). The subset of data is used by a speech recognition system to transcribe the voice message. The transcribed voice message is provided to the user on the display screen of the mobile device.
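The retrieval step above can be sketched as follows. This is an illustrative sketch only: the profile fields, phone-number keys, and the `select_subset_for_caller` helper are hypothetical assumptions, not details from this disclosure.

```python
# Hypothetical sketch: select a caller-specific subset of stored social data
# (the caller's profile plus a predetermined number of recent emails).
# All field names are illustrative assumptions.

def select_subset_for_caller(caller_id, profiles, recent_emails, max_emails=5):
    """Return the data subset used to bias transcription of a voice message."""
    profile = profiles.get(caller_id)
    if profile is None:
        return {"profile": None, "emails": []}
    # Most recent emails from this caller to the user, newest first.
    from_caller = [e for e in recent_emails if e["sender"] == caller_id]
    from_caller.sort(key=lambda e: e["timestamp"], reverse=True)
    return {"profile": profile, "emails": from_caller[:max_emails]}

profiles = {"+15551234": {"name": "Amy Bonforte", "known_contacts": ["Terra"]}}
emails = [
    {"sender": "+15551234", "timestamp": 100, "subject": "dinner"},
    {"sender": "+15559999", "timestamp": 120, "subject": "invoice"},
    {"sender": "+15551234", "timestamp": 90, "subject": "concert"},
]
subset = select_subset_for_caller("+15551234", profiles, emails, max_emails=1)
print(subset["profile"]["name"])       # Amy Bonforte
print(subset["emails"][0]["subject"])  # dinner
```

The cap on `max_emails` mirrors the "predetermined number of recent emails" language above.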
In another embodiment, the user is also presented with a list of people and/or emails or other communications mentioned in the transcribed message. For example, the personal profiles (or links to the personal profiles) of two friends mentioned in the transcribed message may be displayed to the user in the same screen or page as the transcribed information. Likewise, links to emails mentioned by the caller in the transcribed message may be displayed on the same page or on another page (e.g., accessible through a link or icon on the page with the transcribed message).
Many types of data (e.g., personal profiles of callers associated with the user) may be collected for a user in such a database (or in another form of data repository); examples are described in U.S. patent application No. 12/792,698, incorporated by reference above.
In one embodiment, a user's mobile device stores data (e.g., in a database in the form of personal profiles) associated with previous communications and/or other activities of the user on the mobile device (e.g., data extracted from one or more previous communications, such as emails or text messages, or other files or information received by the user from friends of the user or others, such as work colleagues). Other activities may include the manner or method in which the user operates the mobile device (e.g., what buttons or functions the user activated when previously interacting with the caller, what online services the user used when previously interacting with the caller, etc.).
The caller calls the mobile device and leaves a voice message for the user. The caller is identified (e.g., using caller ID). A subset of social and/or other data associated with the caller is retrieved from the user's database (e.g., the caller's personal profile and/or a predetermined number of recent emails sent by the caller to the user). In one embodiment, the subset of data and the caller's identification are sent to a speech-to-text service (e.g., an online service) along with the voice message to be transcribed. The subset of data is used by the speech recognition system to transcribe the voice message. The transcribed message is provided to the user on a display screen of the mobile device.
FIG. 1 shows an example of a display screen 100 provided to a user of a mobile device to view voice messages (e.g., a message from Amy Bonforte), according to one embodiment. A voice message from Amy (left when the user was unavailable or unaware of Amy's call) has been transcribed as described above, and the transcription 102 is presented under Amy's name.
While the voice message is being played, a visual indicator 106 shows the progress of playback. In addition, a visual cursor 114 indicates the position in the transcribed message of the word the user is currently hearing.
The transcription 102 is generated by the speech recognition system using a subset of social data that was sent to the system prior to transcription. The subset of data is collected (e.g., by a server associated with the mobile device) after the voice message from Amy has been recorded. The subset may include Amy's personal profile (including the correct spelling of Amy's name), the most recent emails sent by Amy to the user, and people common to Amy and the user (e.g., people who have been cc'd on emails between Amy and the user).
The speech recognition system transcribes each particular voice message using a subset of the social data. As other voice messages arrive for the user and need to be transcribed, a new, different subset of data is selected and sent to the speech recognition system for transcription of the corresponding voice message. Thus, in one embodiment, each subset of data may be unique to each voice message to be transcribed, although this is not required. Each subset of data may be sent to the speech recognition service from a server associated with the mobile phone that stores an implicit social graph of the user, or may be sent directly from the mobile device.
Using the caller name data provided to the transcription service, the caller's name 108 ("Amy") is correctly transcribed. The names of two friends 110 (Terra and mate), although not previously known to the transcription service, are correctly spelled in the transcription using the subset of data provided from the user's social data.
An email 112 (which may be another form of prior communication) is mentioned in the transcribed message. The use of the word "email" triggers the system to select, using correlation or other matching techniques, the email from the caller to the user that is most relevant to the message (e.g., based on correlation of words in the message with words in previous emails, and/or on how much time has passed since a previous email was sent to the user; a correlation-based ranking system may be used). The email or emails (and other relevant information referenced in the message) selected as most relevant are presented on the display screen 100.
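A minimal sketch of this correlation-based selection might look like the following; the specific weighting of word overlap against recency is an assumption for illustration, not the disclosure's method.

```python
# Illustrative relevance scoring for "which prior email does this message
# refer to?": word overlap plus an exponential recency bonus. The weighting
# and half-life are hypothetical.

def score_email(message_words, email, now, half_life=3600.0):
    overlap = len(message_words & set(email["body"].lower().split()))
    recency = 0.5 ** ((now - email["timestamp"]) / half_life)  # newer is higher
    return overlap + recency

def most_relevant_email(message, emails, now):
    words = set(message.lower().split())
    return max(emails, key=lambda e: score_email(words, e, now))

emails = [
    {"id": 1, "timestamp": 0, "body": "quarterly budget report attached"},
    {"id": 2, "timestamp": 7000, "body": "tickets for the concert on friday"},
]
msg = "hey, about the concert tickets I emailed you"
best = most_relevant_email(msg, emails, now=7200)
print(best["id"])  # 2
```

Email 2 wins on both word overlap ("the", "concert", "tickets") and recency, matching the two correlation signals described above.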
Links 104 to contacts, personal profiles, or other information for the people (e.g., Terra and mate) referenced in the transcribed message may also be presented to the user on the display of the mobile device. These links to people and emails allow the user to click a link 104 to initiate an action, such as contacting the appropriate person by phone or email.
Fig. 2 shows an example of a display screen with options 200 from which the user may select to correct a misspelled word in the voice message of Fig. 1, according to one embodiment. When the user views the transcribed message, the user can select a word such as "mate" to provide the correct spelling. The options 200 presented to the user for correction are selected at least in part from the subset of data sent to the transcription service for transcribing the voice message. If the user selects a different spelling, the speech recognition system stores the correction and uses it in future transcriptions (e.g., of future voice messages from Amy, and even from other callers to the user) to improve accuracy.
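The store-and-reuse behavior for corrections can be sketched as follows; the `CorrectionStore` class and its storage scheme are hypothetical, as is the corrected name used in the example.

```python
# Sketch of remembering a user's spelling correction so future transcriptions
# can apply it. The mapping scheme is an illustrative assumption.

class CorrectionStore:
    def __init__(self):
        self._map = {}  # misrecognized token -> corrected spelling

    def record(self, heard, corrected):
        self._map[heard.lower()] = corrected

    def apply(self, transcript):
        # Replace any remembered misrecognitions in a new transcript.
        return " ".join(self._map.get(w.lower(), w) for w in transcript.split())

store = CorrectionStore()
store.record("mate", "Nate")  # hypothetical correction chosen by the user
print(store.apply("call mate tomorrow"))  # call Nate tomorrow
```

A real system would key corrections per speaker and per acoustic context rather than globally per token.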
Fig. 3 illustrates an example of displaying to the user a list 300 of voice messages waiting to be viewed, according to one embodiment. For example, voice message 302 has been transcribed using a subset of the user's social graph data from the user's social database (e.g., stored on a server associated with the mobile device). In one embodiment, the list 300 may be presented in ranked order of relevance to the user, by ranking the person associated with each voice message. For example, the ranking may be performed as the contact ranking described in U.S. patent application No. 12/792,698, incorporated by reference above.
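Ranking the list by the relevance of each caller can be sketched as below; the rank scores and field names are illustrative assumptions, not the contact-ranking method of the cited application.

```python
# Sketch: order waiting voice messages by a per-contact relevance score.
# Scores and field names are hypothetical.

def rank_messages(messages, contact_rank):
    # Unknown callers get the lowest rank.
    return sorted(messages,
                  key=lambda m: contact_rank.get(m["caller"], 0.0),
                  reverse=True)

contact_rank = {"Amy": 0.9, "Bob": 0.4}
messages = [{"caller": "Bob", "id": 1},
            {"caller": "Amy", "id": 2},
            {"caller": "Unknown", "id": 3}]
ordered = rank_messages(messages, contact_rank)
print([m["id"] for m in ordered])  # [2, 1, 3]
```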
Other specific, non-limiting embodiments of transcription and presentation of voice messages are discussed below. In a first example, the above method is provided by a telecommunications carrier to mobile phone subscribers (e.g., Apple iPhone or Android handset subscribers using a voicemail system) to improve transcription of voice messages. Telecommunications carriers can leverage personal profiles and/or other implicit social graph data to improve their voicemail services. When a user receives a voicemail from a caller, the caller ID information may be used to identify the caller. This identification (optionally along with other information and/or predefined criteria) is used to select a subset of the social graph data to send to a transcription service (e.g., a service already used by the carrier).
In a further example, when a voice message is left, the transmitted subset of data includes the name of the person being called, as well as the names of (and other information about) people whom both the user and the caller know (and who therefore may be mentioned in the voice message). Relevance ratings for these people may also be provided. The subset of data becomes part of the voice message's metadata. Thus, when the voice message passes through speech recognition, the accuracy of names and other information in the transcribed message is improved: context associated with the user is provided to the speech recognition system to better resolve words in the transcribed message.
As shown in FIG. 1 above, the names of friends are often included in voice messages, but recognizing such names is difficult for a speech processing system that has not previously encountered them. The methods described above, however, can provide the correct spelling of these names in the transcribed message. The names (or personal profiles) of the people selected for the subset of social data may be limited to a predetermined number (e.g., 10 names).
The subset of data is sent to an online service on the Internet that performs the speech-to-text conversion. The service takes the recorded voicemail message provided by the handset or carrier and performs the transcription. The subset of data may be provided to the transcription server in a server-to-server manner, or may be provided by the user's smartphone. The online service may generate the transcription and send the result back as a text message or web page.
As previously described, a reference in a message to a previous communication (e.g., "I just sent you an email") may be used as a trigger for selecting certain types of information related to that previous communication. For example, the subset of data may include all recent emails from and/or to the caller (and may include the subject lines of those emails) to facilitate transcription of names or other information included in the voice message (e.g., the name of a performer or concert that the recognition system may not know, but that may be included in a previous email). Text or other data used in or associated with recent emails can significantly improve the system's ability to transcribe such words in the transcribed message, and thus improve accuracy.
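One plausible way to turn recent email text into recognition hints is sketched below. Many commercial speech-to-text APIs accept such "phrase hint" lists to bias recognition; the capitalized-token heuristic for proper nouns is an illustrative assumption, not the disclosure's exact technique.

```python
# Sketch: derive a vocabulary-bias list ("phrase hints") from recent emails so
# the recognizer can resolve otherwise-unknown names. The proper-noun heuristic
# (capitalized mid-sentence tokens) is a rough, hypothetical proxy.

import re
from collections import Counter

def phrase_hints(emails, limit=10):
    counts = Counter()
    for e in emails:
        for sentence in re.split(r"[.!?]", e["body"]):
            for token in sentence.split()[1:]:  # skip sentence-initial word
                if token[:1].isupper() and token.isalpha():
                    counts[token] += 1
    return [w for w, _ in counts.most_common(limit)]

emails = [{"body": "The show by Radiohead was great. Ask Terra about it."},
          {"body": "Terra said Radiohead plays again Friday."}]
print(phrase_hints(emails, limit=2))
```

The `limit` parameter corresponds to capping the subset at a predetermined number of names, as described above.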
In one example, the user interface allows the user to correct the transcribed message, as described above. If a word is misspelled, the user can simply tap the word and briefly hold a finger down on the screen. A list of relevant options then appears, from which the user may select (e.g., similar-sounding alternatives from the user's social graph that the system considered when performing the transcription, such as names of other people that sound like "mate"). The speech recognition system also remembers the correction, so that the corresponding speech patterns are mapped more accurately in future transcriptions.
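Generating similar-sounding candidates can be sketched with a simple phonetic key such as Soundex; the disclosure does not specify an algorithm, so this is only one plausible choice, and the candidate names are made up.

```python
# Sketch: offer correction candidates whose Soundex key matches the tapped
# word's key, drawn from names in the user's social graph. Illustrative only.

def soundex(word):
    codes = {"bfpv": "1", "cgjkqsxz": "2", "dt": "3",
             "l": "4", "mn": "5", "r": "6"}
    def code(c):
        return next((v for k, v in codes.items() if c in k), "")
    word = word.lower()
    out, prev = word[0].upper(), code(word[0])
    for c in word[1:]:
        d = code(c)
        if d and d != prev:
            out += d
        prev = d
    return (out + "000")[:4]

def similar_names(tapped, names):
    key = soundex(tapped)
    return [n for n in names if soundex(n) == key]

names = ["Nate", "Terra", "Matt", "Nita"]
print(similar_names("mate", names))
```

Classic Soundex keeps the first letter, so "mate" matches "Matt" but not "Nate"; a production system would likely use a more forgiving phonetic or acoustic distance.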
In another example, if the transcribed message references a previous email (e.g., "I sent you an email ten minutes ago"), the subset of data may include, as an additional set, the people who were carbon-copied on emails within the past 10 or 30 minutes or another period of time (whether or not those people are highly relevant to the user), thereby providing additional information to the speech recognition system.
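The time-window expansion above can be sketched as follows; field names and the window default are illustrative assumptions.

```python
# Sketch: gather people cc'd on emails within a recent time window to expand
# the data subset sent to the recognizer. Field names are hypothetical.

def recent_cc_people(emails, now, window_seconds=1800):
    people = set()
    for e in emails:
        if now - e["timestamp"] <= window_seconds:
            people.update(e.get("cc", []))
    return sorted(people)

emails = [
    {"timestamp": 1000, "cc": ["Jacob", "Terra"]},
    {"timestamp": 4000, "cc": ["Dana"]},
]
print(recent_cc_people(emails, now=4500, window_seconds=1800))  # ['Dana']
```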
In another example, emails to the user often include an introduction to a new person (e.g., "Hey Jeff, this is David. I just sent you an email introducing Jacob; he is the founder of a startup I want you to talk to. Could you give Jacob a call?"). Such an introduction is typically followed by a telephone call. The previous email is sent in the subset of data, and the speech recognition system thereby improves the accuracy of names not previously encountered by the processing system. The subset of data may also include information about the startup from the user's database, enabling that particular transcription to be performed more accurately.
In one example, a voicemail message is displayed to the user along with context (e.g., emails and contacts) that is believed to be referenced in the voicemail message.
In a further example, based on the caller ID (from a mobile device or a server that has previously seen the caller's phone number), a small subset of the user's implicit social graph may be selected and sent to the speech recognition system. In one example, a voice message may be routed both to the speech recognition system and to the user's phone. The user's smartphone may perform some of the processing, but the service that receives and processes voice messages may perform some or all of it.
For example, in the server-to-server case, the carrier sends a voice message to a server for transcription, but first pings the server associated with the user's mobile device (which stores the user's social graph) to indicate that the user has received a voicemail from a particular phone number. That server creates a subset of social data (including people, phone numbers, etc.) around the phone number, which is provided as metadata. The transcription is sent back to the carrier, and the carrier sends it to the mobile device.
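The server-to-server flow can be sketched with stubbed services, purely for illustration; none of these interfaces come from the disclosure.

```python
# Sketch of the server-to-server flow: ask the social-graph server for caller
# metadata, then send the voicemail plus metadata to the transcription service.
# All services are stubbed; the interfaces are hypothetical.

def transcribe_voicemail(caller_number, audio, graph_server, stt_service):
    metadata = graph_server(caller_number)  # subset of social data, as metadata
    return stt_service(audio, metadata)     # transcription biased by metadata

def fake_graph_server(number):
    return {"caller_name": "Amy"} if number == "+15551234" else {}

def fake_stt_service(audio, metadata):
    name = metadata.get("caller_name", "someone")
    return f"Voicemail from {name}: {audio}"

text = transcribe_voicemail("+15551234", "call me back",
                            fake_graph_server, fake_stt_service)
print(text)  # Voicemail from Amy: call me back
```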
In one example, the subset of data is strongly targeted and highly tailored to the specific case. The subset of data is also drawn from an implicit graph (obtained simply by observing the user's previous communication habits); it need not be explicitly maintained like a conventional address book.
Fig. 4 illustrates a system for presenting voice message information to a user of a computing device, such as a mobile device 150 (e.g., an iPhone device), according to one embodiment. In Fig. 4, user terminals (e.g., 141, 143, 145) and/or mobile devices, including mobile device 150, are used to access a server 123 over a communication network 121.
Server 123 may include one or more web servers (or other types of data communication servers) to communicate with the user terminals (e.g., 141, 143, 145).
The server 123 may be connected to a data storage facility to store user-provided content, such as multimedia content, navigation data, preference data, and the like. The server 123 may also store the personal profile 154 or access the stored personal profile 154.
Personal profile 154 may be created and updated based on email or other communications to and from mobile device 150 and other mobile devices of various users. In an alternative embodiment, the personal profile 152 may be stored in a memory of the mobile device 150. During operation, mobile device 150 may access and use a personal profile obtained locally from mobile device 150 or from server 123 via communication network 121.
When a voice message sent to or left for the user of mobile device 150 is received, one or more of the personal profiles and/or other data described herein may be sent with the voice message over the communication network 121 to the speech recognition system 160 to be transcribed, as discussed herein.
System 160 may store personal profile 162, which may include profiles received from mobile device 150 and/or server 123. Personal profile 162 may also be received from other computing devices not shown in fig. 4.
While fig. 4 illustrates an implementation of an exemplary system in a client-server architecture, embodiments of the present disclosure may be implemented in a variety of alternative architectures. For example, the system may be implemented by a user terminal over a peer-to-peer network, where content and data are shared over a peer-to-peer communication connection.
In some embodiments, a combination of a client server architecture and a peer-to-peer architecture may be used, where one or more central servers may be used to provide some information and/or services and a peer-to-peer network is used to provide other information and/or services. Accordingly, embodiments of the present disclosure are not limited to a particular structure.
FIG. 5 illustrates a block diagram of a data processing system that may be used in various embodiments (e.g., to implement server 123 or speech recognition system 160). Although FIG. 5 illustrates various components of a computer system, it is not intended to represent any particular structure or manner of interconnecting the components. Other systems with more or fewer components may also be used.
In Fig. 5, a system 201 includes an interconnect 202 (e.g., a bus and system core logic) that interconnects microprocessor(s) 203 and memory 208. In the example of Fig. 5, the microprocessor 203 is coupled to a cache memory 204.
The interconnect 202 interconnects the microprocessor(s) 203 and the memory 208, and also connects them to a display controller and display device 207, and to peripheral devices such as input/output (I/O) devices 205 through an input/output (I/O) controller 206. Typical I/O devices include mice, keyboards, modems, network interfaces, printers, scanners, video cameras, and other devices known in the art.
The interconnect 202 may include one or more buses connected to one another through various bridges, controllers, and/or adapters. In one embodiment, the I/O controller 206 includes a USB adapter for controlling USB (Universal Serial Bus) peripherals and/or an IEEE-1394 bus adapter for controlling IEEE-1394 compliant peripherals.
The memory 208 may include ROM (read only memory), as well as volatile RAM (random access memory) and non-volatile memory, such as a hard disk drive, flash memory, and the like.
Volatile RAM is typically implemented as dynamic RAM (DRAM), which requires continuous power to refresh or maintain the data in the memory. Non-volatile memory is typically a magnetic hard drive, a magneto-optical drive, an optical drive (e.g., DVD-RAM), or another type of storage system that retains data even after power is removed from the system. Non-volatile memory may also be a random access memory.
The non-volatile memory may be a local device coupled directly to the rest of the components in the data processing system. Non-volatile storage remote from the system, such as a network storage device coupled to the data processing system through a network interface such as a modem or Ethernet interface, may also be used. In one embodiment, a data processing system such as that shown in FIG. 5 is used to implement a server or speech recognition system, and/or other servers.
In one embodiment, a data processing system as shown in FIG. 5 is used to implement a user terminal. The user terminal may be in the form of a Personal Digital Assistant (PDA), a cellular telephone, a notebook computer or a personal desktop computer.
In some embodiments, one or more servers of the system may be replaced with a peer-to-peer network of multiple data processing systems, or with a network of distributed computing systems. The peer-to-peer network, or distributed computing system, may be collectively viewed as a server data processing system.
Embodiments of the invention may be implemented via the microprocessor(s) 203 and/or the memory 208. For example, the functions described may be implemented partly via hardware logic in the microprocessor(s) 203 and partly using instructions stored in the memory 208. Some embodiments are implemented using the microprocessor(s) 203 without instructions stored in the memory 208. Some embodiments are implemented using instructions stored in the memory 208 for execution by one or more general-purpose microprocessors 203. Thus, the disclosure is not limited to a specific configuration of hardware and/or software.
Fig. 6 shows a block diagram of a user device, such as mobile device 150, according to one embodiment. In Fig. 6, the user device comprises an interconnect 221 connecting a presentation device, a user input device 231, a processor 233, a memory 227, a location identification unit 225, and a communication device 223.
In fig. 6, location identification unit 225 is used to identify the geographic location of user content created for sharing. The location identification unit 225 may include a satellite positioning system receiver (e.g., a Global Positioning System (GPS) receiver) to automatically determine the current location of the user device.
In fig. 6, the communication device 223 is configured to communicate with a server and/or a speech recognition system. In one embodiment, the user input device 231 is configured to generate user data content. The user input device 231 may include a text input device, a still image camera, a video camera, and/or a voice recorder, among others.
Various other embodiments are now described. In one embodiment, a method comprises: receiving first data associated with previous communications or activities of a first user on a mobile device of the first user; receiving, by a computing device, a voice message for the first user; transcribing, by the computing device, the voice message using the first data to provide a transcribed message; and sending the transcribed message to the mobile device for display to the first user.
In one embodiment, the first data includes at least one personal profile, including a personal profile of a caller who created the voice message. In one embodiment, the voice message is created by a caller and the first data comprises a predetermined number of most recent messages sent by the caller to the first user.
In one embodiment, the first data includes a plurality of personal profiles including personal profiles of persons other than the first user referenced in the voice message. In one embodiment, a voice message and first data are received from the mobile device.
The first data may be received from a server, and the server may store a plurality of personal profiles of users of the mobile device (including the first user). The transcription may be performed using a speech recognition system.
In one embodiment, the method further comprises: a list of people or messages is sent to the mobile device for display to the first user, each person or message in the list being referenced in the transcribed message. In one embodiment, the first data is associated with a previous activity of the first user (including a method of operation of the mobile device).
In one embodiment, the method further comprises: a link is sent to the mobile device to the email referenced in the transcribed message. The voice message may be created by a caller, and the method may further comprise: sending to the mobile device a personal profile of at least one person other than the caller referenced in the transcribed message.
In one embodiment, a non-transitory computer-readable storage medium stores computer-readable instructions that, when executed, cause a mobile device of a first user to perform: sending, using the mobile device, first data to a computing apparatus, wherein the first data is associated with previous communications or activities of the first user on the first user's mobile device; sending, using the mobile device, a voice message of the first user to the computing apparatus; and receiving at the mobile device a transcribed message from the computing apparatus, wherein the computing apparatus has transcribed the voice message using the first data to create the transcribed message.
In one embodiment, the first data comprises a plurality of personal profiles, including a personal profile of a person other than the first user referenced in the voice message, and the instructions further cause the mobile device to perform: storing the plurality of personal profiles in a memory of the mobile device. In one embodiment, the instructions further cause the mobile device to perform: transmitting the personal profile to a server other than the computing device, wherein the server is configured to store a plurality of personal profiles of users of the mobile device, including the first user.
The computing device may be a speech recognition system. The instructions may also cause the mobile device to receive a personal profile of a person referenced in the transcribed message. The instructions may further cause the mobile device to present, on a display screen of the mobile device, a list of people or messages, each person or message in the list being referenced in the transcribed message.
In one embodiment, a system comprises: at least one processor, and a memory storing instructions configured to instruct the at least one processor to: receiving, on a mobile device of a first user, first data associated with a previous communication or activity of the first user; transcribing, by the computing device, a voice message using the first data to provide a transcribed message; and sending the transcribed message to the mobile device for display to a user.
In one embodiment, the first data includes at least one personal profile, including a personal profile of a caller who created the voice message. In one embodiment, the first data is received from a server, and the server stores a plurality of personal profiles of users of the mobile device (including the first user).
In this specification, various functions and operations may be described as being performed by or caused by software code to simplify the description. Those skilled in the art will recognize that such expressions mean that the functions result from execution of the code by a processor, such as a microprocessor. Alternatively, or in combination, the functions and operations may be implemented using special-purpose circuitry, with or without software instructions, such as an Application-Specific Integrated Circuit (ASIC) or a Field-Programmable Gate Array (FPGA). Embodiments may be implemented using hardwired circuitry without software instructions, or in combination with software instructions. Thus, the techniques are limited neither to any specific combination of hardware circuitry and software, nor to any particular source for the instructions executed by the data processing system.
While some embodiments may be implemented in fully functioning computers and computer systems, various embodiments are capable of being distributed as a computing product in a variety of forms and are capable of being applied regardless of the particular type of machine or computer-readable media used to actually effect the distribution.
At least some aspects of the disclosure may be embodied, at least in part, in software. That is, the techniques may be performed in a computer system or other data processing system in response to its processor (e.g., a microprocessor) executing sequences of instructions contained in a memory (e.g., read only memory, volatile RAM, non-volatile memory, a cache, or a remote storage device).
The routines executed to implement the embodiments may be implemented as part of an operating system, middleware, a service delivery platform, an SDK (Software Development Kit) component, a web service, or another specific application, component, program, object, module, or sequence of instructions referred to as a "computer program". The call interfaces to these routines can be exposed to the software development community as APIs (Application Programming Interfaces). A computer program typically comprises one or more instructions set at various times in various memory and storage devices in a computer that, when read and executed by one or more processors in the computer, cause the computer to perform the operations necessary to execute elements involving the various aspects.
A machine-readable medium may be used to store software and data which, when executed by a data processing system, cause the system to perform various methods. The executable software and data may be stored in various places including, for example, read-only memory, volatile RAM, non-volatile memory, and/or cache. Portions of this software and/or data may be stored in any one of these storage devices. Further, the data and instructions can be obtained from centralized servers or peer-to-peer networks. Different portions of the data and instructions can be obtained from different centralized servers and/or peer-to-peer networks at different times, in different communication sessions, or in a same session. The data and instructions can be obtained in their entirety prior to the execution of the applications. Alternatively, portions of the data and instructions can be obtained dynamically, just in time, when needed for execution. Thus, it is not required that the data and instructions be on a machine-readable medium in their entirety at a particular instance of time.
Examples of computer-readable media include, but are not limited to, recordable and non-recordable type media such as volatile and non-volatile memory devices, read-only memory (ROM), random access memory (RAM), flash memory devices, floppy and other removable disks, magnetic disk storage media, and optical storage media (e.g., compact disc read-only memory (CD-ROM), digital versatile disks (DVDs), etc.), among others. A computer-readable medium may store the instructions.
The instructions may also be embodied in digital and analog communications links of electrical, optical, acoustical or other form of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.). Propagated signals, such as carrier waves, infrared signals, digital signals, etc., are not tangible machine-readable media and are not configured to store instructions.
In general, a tangible machine-readable medium includes any mechanism that provides (e.g., stores) information in a form accessible by a machine (e.g., a computer, network device, personal digital assistant, manufacturing tool, any device with one or more processors, etc.).
In various embodiments, hardwired circuitry may be used in combination with software instructions to implement techniques. Thus, the techniques are not limited to any specific combination of hardware circuitry and software nor to any particular source for the instructions executed by the data processing system.
Although some embodiments describe a number of operations in a particular order, operations that are not order dependent may be reordered, and other operations may be combined or broken out. While some reordering or other groupings are specifically mentioned, others will be apparent to those of ordinary skill in the art, so an exhaustive list of alternatives is not presented. Further, it should be recognized that the stages could be implemented in hardware, firmware, software, or any combination thereof.
In the foregoing specification, the disclosure has been described with reference to specific exemplary embodiments thereof. It will be evident that various modifications may be made thereto without departing from the broader spirit and scope as set forth in the following claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Claims (20)
1. A method, comprising:
receiving first data associated with previous communications or activities of a first user on a mobile device of the first user;
receiving, by a computing device, a voice message for the first user;
transcribing, by the computing device, the voice message using the first data to provide a transcribed message; and
sending the transcribed message to the mobile device for display to the first user.
2. The method of claim 1, wherein the first data comprises at least one personal profile, the at least one personal profile comprising a personal profile of a caller creating the voice message.
3. The method of claim 1, wherein the voice message is created by a caller and the first data comprises a predetermined number of most recent messages sent by the caller to the first user.
4. The method of claim 1, wherein the first data comprises a plurality of personal profiles including personal profiles of persons other than the first user mentioned in the voice message.
5. The method of claim 1, wherein the voice message and the first data are received from the mobile device.
6. The method of claim 1, wherein the first data is received from a server, and the server stores a plurality of personal profiles for a plurality of users of the mobile device including the first user.
7. The method of claim 1, wherein the transcribing is performed using a speech recognition system.
8. The method of claim 1, further comprising: sending a list of people or messages to the mobile device for display to the first user, each person or message in the list being mentioned in the transcribed message.
9. The method of claim 1, wherein the first data is associated with previous activities of the first user including a mode of operation of the mobile device.
10. The method of claim 1, further comprising: sending, to the mobile device, a link to an email mentioned in the transcribed message.
11. The method of claim 1, wherein the voice message is created by a caller, and further comprising: sending to the mobile device a personal profile of at least one person other than the caller mentioned in the transcribed message.
12. A non-transitory computer-readable storage medium storing computer-readable instructions that, when executed, cause a mobile device of a first user to:
sending, using the mobile device, first data to a computing apparatus, wherein the first data is associated with previous communications or activities of the first user on the mobile device;
sending, using the mobile device, a voice message for the first user to the computing apparatus; and
receiving, at the mobile device, a transcribed message from the computing apparatus, wherein the computing apparatus has transcribed the voice message using the first data to create the transcribed message.
13. The storage medium of claim 12, wherein the first data comprises a plurality of personal profiles, including personal profiles of persons other than the first user referenced in the voice message, and the instructions further cause the mobile device to store the plurality of personal profiles in a memory of the mobile device.
14. The storage medium of claim 12, wherein the instructions further cause the mobile device to transmit personal profiles to a server other than the computing apparatus, wherein the server is configured to store personal profiles for a plurality of users of the mobile device including the first user.
15. The storage medium of claim 12, wherein the computing apparatus is a speech recognition system.
16. The storage medium of claim 12, wherein the instructions further cause the mobile device to receive a personal profile of a person mentioned in the transcribed message.
17. The storage medium of claim 12, wherein the instructions further cause the mobile device to present a list of people or messages to the first user on a display screen of the mobile device, each person or message in the list being mentioned in the transcribed message.
18. A system, comprising:
at least one processor, and
a memory storing instructions configured to cause the at least one processor to:
receiving first data associated with previous communications or activities of a first user on a mobile device of the first user;
receiving a voice message for the first user;
transcribing the voice message using the first data to provide a transcribed message; and
sending the transcribed message to the mobile device for display to the first user.
19. The system of claim 18, wherein the first data comprises at least one personal profile including a personal profile of a caller who created the voice message.
20. The system of claim 18, wherein the first data is received from a server, and the server stores a plurality of personal profiles for a plurality of users of the mobile device including the first user.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US61/499,643 | 2011-06-21 | ||
| US13/528,693 | 2012-06-20 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1192399A true HK1192399A (en) | 2014-08-15 |
| HK1192399B HK1192399B (en) | 2019-11-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11349991B2 (en) | Systems and methods to present voice message information to a user of a computing device | |
| US10678503B2 (en) | System and method for connecting to addresses received in spoken communications | |
| US9398128B2 (en) | Identifying a contact based on a voice communication session | |
| US20100246784A1 (en) | Conversation support | |
| US9313329B2 (en) | Voice response systems browsing | |
| CN103620635A (en) | Presenting top contact information to a user of a computing device | |
| US20120259633A1 (en) | Audio-interactive message exchange | |
| US20160098995A1 (en) | Speech to text training method and system | |
| US10257350B2 (en) | Playing back portions of a recorded conversation based on keywords | |
| CN114760387A (en) | Method and device for managing maintenance | |
| CN117882365A (en) | Verbal menu for determining and visually displaying calls | |
| US8958775B2 (en) | Aggregating voicemail messages from multiple sources | |
| US8340640B2 (en) | Transcription systems and methods | |
| WO2014001453A1 (en) | System and method to analyze voice communications | |
| HK1192399A (en) | Systems and methods to present voice message information to a user of a computing device | |
| HK1192399B (en) | Systems and methods to present voice message information to a user of a computing device | |
| US11962715B2 (en) | Telephone call information collection and retrieval | |
| HK1162783A (en) | Conversation support |