HK40015116B - Wireless coordination of audio sources - Google Patents
- Publication number: HK40015116B (application HK62020004599.0A)
- Authority: HK (Hong Kong)
- Prior art keywords: playback, electronic device, hub, time, audio content
Description
Technical Field
The described embodiments relate to communication technology. More specifically, the described embodiments include communication techniques to wirelessly coordinate playback times of electronic devices that output sound.
Background
Music generally has a significant impact on an individual's mood and perception. This is thought to result from the association between the areas of the brain that decode, learn and remember music and the areas that produce emotional responses, such as the frontal lobe and the limbic system. Indeed, emotion is believed to be involved in the interpretation of music and to be central to music's influence on the brain. Given this ability of music to make a listener "feel", audio quality is often an important factor in user satisfaction when a user listens to audio content and, more generally, when a user watches and listens to audio/video (A/V) content.
However, achieving high audio quality in an environment is often challenging. For example, an acoustic source (e.g., a speaker) may not be properly placed in the environment. Alternatively or additionally, the listener may not be located at an ideal position in the environment. In particular, in stereo playback systems the so-called "sweet spot", in which the amplitude and arrival-time differences are small enough that both the apparent image and the localization of the original sound source are preserved, is typically limited to a rather small area between the loudspeakers. When the listener is outside this region, the stereo image collapses, and only one or the other of the independent audio channels output by the speakers may be heard. Furthermore, achieving high audio quality in an environment often places significant constraints on the synchronization of the speakers.
Thus, when one or more of these factors is suboptimal, the acoustic quality in the environment may be degraded. This, in turn, may adversely affect listener satisfaction and the overall user experience when listening to audio content and/or A/V content.
Disclosure of Invention
A first set of described embodiments includes an audio/video (a/V) hub. The A/V hub includes: one or more antennas; and an interface circuit that, during operation, communicates with the electronic device using wireless communication. During operation, the a/V hub receives frames from the electronic devices via wireless communication, wherein a given frame includes a transmission time when the given electronic device transmits the given frame. The A/V hub then stores a receive time when the frame was received, where the receive time is based on a clock in the A/V hub. Further, the a/V hub calculates a current time offset between a clock in the electronic device and a clock in the a/V hub based on the reception time and the transmission time of the frame. Next, the a/V hub transmits one or more frames including the audio content and playback timing information to the electronic device, wherein the playback timing information specifies a playback time when the electronic device is to playback the audio content based on the current time offset. Furthermore, the playback times of the electronic devices have a temporal relationship, thereby coordinating the playback of audio content by the electronic devices.
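To make the timing computation concrete, the following Python sketch (illustrative only; the function and variable names are hypothetical and not part of the described embodiments) shows one way a hub could derive a per-device time offset from the transmit and receive timestamps and then express a common playback instant in each device's local clock, assuming the radio propagation delay is negligible.

```python
# Hypothetical sketch of the offset/playback-time computation (not from the patent text).
# Assumes radio propagation delay is negligible, so at the instant a frame arrives the
# hub clock reads rx_time while the sender's clock reads tx_time.

def current_time_offset(tx_time_device: float, rx_time_hub: float) -> float:
    """Offset = hub clock minus device clock, in seconds."""
    return rx_time_hub - tx_time_device

def playback_time_for_device(playback_time_hub: float, offset: float,
                             relative_delay: float = 0.0) -> float:
    """Convert a playback instant from the hub's clock into a device's clock.

    relative_delay implements the 'temporal relationship': a non-zero value
    shifts this device's playback relative to the others (e.g., for phasing).
    """
    return playback_time_hub - offset + relative_delay

# Example: two speakers report transmit times; the hub recorded receive times.
frames = {
    "speaker-1": {"tx": 1000.000000, "rx": 1000.000512},
    "speaker-2": {"tx":  999.999100, "rx": 1000.000507},
}
playback_time_hub = 1000.250000   # when, on the hub clock, playback should start
for name, f in frames.items():
    offset = current_time_offset(f["tx"], f["rx"])
    print(name, playback_time_for_device(playback_time_hub, offset))
```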
Note that the temporal relationship may have non-zero values, instructing at least some electronic devices to playback audio content having a phase relative to each other by using different values of playback time. For example, the different playback times may be based on an acoustic characterization of the environment that includes the electronic device and the a/V hub. Alternatively or additionally, the different playback times may be based on desired acoustic characteristics in the environment.
In some embodiments, the electronic device is located at a vector distance from the a/V hub, and the interface circuit determines a magnitude of the vector distance based on the transmit time and the receive time using wireless ranging. Further, the interface circuit may determine an angle of the vector distance based on an angle of arrival of a wireless signal associated with a frame received by the one or more antennas during the wireless communication. Further, the different playback times may be based on the determined vector distance.
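A minimal sketch of how the vector distance might be obtained is given below. The patent text describes using the transmit and receive times; this sketch uses the equivalent round-trip formulation (so the clocks need not already be coordinated) and a two-antenna phase-difference measurement for the angle of arrival. The names and numeric values are illustrative, not taken from the patent.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def ranging_distance(rtt_seconds: float, turnaround_seconds: float) -> float:
    """Round-trip wireless ranging: distance = c * (RTT - device turnaround) / 2."""
    return C * (rtt_seconds - turnaround_seconds) / 2.0

def angle_of_arrival(phase_delta_rad: float, wavelength_m: float,
                     antenna_spacing_m: float) -> float:
    """Angle (radians, from broadside) of an incoming wavefront, from the phase
    difference between two antennas spaced antenna_spacing_m apart."""
    return math.asin(phase_delta_rad * wavelength_m / (2.0 * math.pi * antenna_spacing_m))

# Example: 70 ns round trip with a 25 ns device turnaround (~6.7 m), and an AoA
# measurement of a 2.4 GHz signal with half-wavelength antenna spacing.
d = ranging_distance(70e-9, 25e-9)
theta = angle_of_arrival(0.8, 0.125, 0.0625)
print(f"vector distance ≈ ({d:.2f} m, {math.degrees(theta):.1f} degrees)")
```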
Alternatively or additionally, the different playback times are based on an estimated position of the listener relative to the electronic device. For example, the interface circuit may: communicating with another electronic device; and calculating an estimated location of the listener based on the communication with the other electronic device. Further, the a/V hub may include an acoustic wave sensor that performs acoustic measurements of the environment including the a/V hub, and the a/V hub may calculate an estimated location of the listener based on the acoustic measurements. Further, the interface circuit may communicate with other electronic devices in the environment and may receive additional sound measurements of the environment from the other electronic devices. In these embodiments, the a/V hub calculates the estimated location of the listener based on the additional sound measurements. In some embodiments, the interface circuit: performing a time-of-flight measurement; and calculates an estimated position of the listener based on the time-of-flight measurements.
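As one hedged illustration of estimating the listener's position from time-of-flight measurements, the sketch below performs a simple linearized 2-D trilateration from acoustic time-of-flight values to devices at known locations; the anchor coordinates and times are made-up examples, and real systems could equally use radio time-of-flight or other measurements.

```python
import numpy as np

def estimate_listener_position(anchors: np.ndarray, distances: np.ndarray) -> np.ndarray:
    """Linearized 2-D trilateration: solve for the listener position from
    distances (e.g., time of flight times propagation speed) to known anchors."""
    x0, y0 = anchors[0]
    d0 = distances[0]
    A, b = [], []
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        A.append([2 * (xi - x0), 2 * (yi - y0)])
        b.append(d0**2 - di**2 + xi**2 - x0**2 + yi**2 - y0**2)
    pos, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return pos

anchors = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])   # hub + two speakers (m)
tof = np.array([9.33e-3, 9.33e-3, 6.01e-3])                # acoustic time of flight (s)
print(estimate_listener_position(anchors, 343.0 * tof))    # ≈ [2.0, 2.5]
```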
Note that the electronic device may be located at a non-zero distance from the A/V hub, and the interface circuit may calculate the current time offset using wireless ranging, based on the transmission time and the reception time, by ignoring that distance.
Further, the current time offset may be based on a model of clock drift in the electronic device.
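The patent does not specify the drift model; as one plausible example, the sketch below fits a linear clock-drift model (a fixed offset plus a constant drift rate) to previously measured offsets and extrapolates it, which would let the hub schedule playback between timing exchanges. All values are hypothetical.

```python
def predict_offset(t_now: float, samples: list[tuple[float, float]]) -> float:
    """Linear clock-drift model: fit offset(t) = a + b*t to past (time, offset)
    samples by least squares and extrapolate to t_now."""
    n = len(samples)
    t_mean = sum(t for t, _ in samples) / n
    o_mean = sum(o for _, o in samples) / n
    num = sum((t - t_mean) * (o - o_mean) for t, o in samples)
    den = sum((t - t_mean) ** 2 for t, _ in samples)
    drift_rate = num / den if den else 0.0   # seconds of offset per second
    return o_mean + drift_rate * (t_now - t_mean)

# Example: offsets measured at t = 0 s and t = 10 s; predict the offset at t = 12 s.
print(predict_offset(12.0, [(0.0, 512e-6), (10.0, 518e-6)]))   # ≈ 519.2e-6 s
```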
Another embodiment provides a computer readable storage medium for use with an a/V hub. The computer readable storage medium includes program modules that, when executed by the a/V hub, cause the a/V hub to perform at least some of the operations described above.
Another embodiment provides a method for coordinating playback of audio content. The method includes at least some operations performed by the a/V hub.
Another embodiment provides one or more electronic devices.
A second group of described embodiments includes an audio/video (a/V) hub. The A/V hub includes: one or more antennas; and an interface circuit that, during operation, communicates with the electronic device using wireless communication. During operation, the a/V hub receives frames from the electronic device via wireless communication. The A/V hub then stores a receive time when the frame was received, where the receive time is based on a clock in the A/V hub. Further, the a/V hub calculates a current time offset between a clock in the electronic device and a clock in the a/V hub based on the time of receipt of the frame and an expected time of transmission, wherein the expected time of transmission is based on coordination of the clock in the electronic device and the clock in the a/V hub at a previous time and a predefined transmission schedule for the frame. Next, the a/V hub transmits one or more frames including the audio content and playback timing information to the electronic device, wherein the playback timing information specifies a playback time at which the electronic device is to play back the audio content based on the current time offset. Furthermore, the playback times of the electronic devices have a temporal relationship, thereby coordinating the playback of audio content by the electronic devices.
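The following sketch (hypothetical names and values) illustrates this second approach: the expected transmission time of a frame is reconstructed from the last coordination instant and the predefined transmission schedule, and the offset is the difference between the actual reception time and that expected time, again ignoring radio propagation delay.

```python
def expected_transmit_time(sync_time: float, schedule_period: float, frame_index: int) -> float:
    """Expected transmission time of the n-th scheduled frame, in the hub's clock,
    given that the clocks were coordinated at sync_time and frames follow a fixed period."""
    return sync_time + frame_index * schedule_period

def offset_from_schedule(rx_time_hub: float, sync_time: float,
                         schedule_period: float, frame_index: int) -> float:
    """Drift accumulated since the last coordination: the difference between when the
    frame actually arrived and when the schedule says it should have been transmitted
    (propagation delay ignored)."""
    return rx_time_hub - expected_transmit_time(sync_time, schedule_period, frame_index)

# Example: clocks coordinated at t = 0, frames scheduled every 100 ms; the 50th frame
# arrives 180 microseconds later than the schedule predicts.
print(offset_from_schedule(5.000180, 0.0, 0.100, 50))   # ≈ 1.8e-4 s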
Note that the temporal relationship may have non-zero values, instructing at least some electronic devices to playback audio content having a phase relative to each other by using different values of playback time. For example, the different playback times may be based on an acoustic characterization of the environment that includes the electronic device and the a/V hub. Alternatively or additionally, the different playback times may be based on desired acoustic characteristics in the environment.
In some embodiments, the electronic device is located at a vector distance from the a/V hub, and the interface circuit determines a magnitude of the vector distance using wireless ranging based on a transmission time of the frame and the reception time. Further, the interface circuit may determine an angle of the vector distance based on an angle of arrival of a wireless signal, associated with the frame, received by the one or more antennas during the wireless communication. Further, the different playback times may be based on the determined vector distance.
Alternatively or additionally, the different playback times are based on an estimated position of the listener relative to the electronic device. For example, the interface circuit may: communicating with another electronic device; and calculating an estimated location of the listener based on the communication with the other electronic device. Further, the a/V hub may include an acoustic wave sensor that performs acoustic measurements of the environment including the a/V hub, and the a/V hub may calculate an estimated location of the listener based on the acoustic measurements. Further, the interface circuit may communicate with other electronic devices in the environment and may receive additional sound measurements of the environment from the other electronic devices. In these embodiments, the a/V hub calculates the estimated location of the listener based on the additional sound measurements. In some embodiments, the interface circuit: performing a time-of-flight measurement; and calculates an estimated position of the listener from the time-of-flight measurements.
Note that the coordination of the clock in the electronic device and the clock in the a/V hub may have occurred during an initialization mode of operation.
Further, the current time offset may be based on a model of clock drift in the electronic device.
Another embodiment provides a computer readable storage medium for use with an a/V hub. The computer readable storage medium includes program modules that, when executed by the a/V hub, cause the a/V hub to perform at least some of the operations described above.
Another embodiment provides a method for coordinating playback of audio content. The method includes at least some operations performed by the a/V hub.
Another embodiment provides one or more electronic devices.
This summary is provided merely for the purpose of illustrating some exemplary embodiments in order to provide a basic understanding of some aspects of the subject matter described herein. Accordingly, it should be understood that the above-described features are merely examples and should not be construed to narrow the scope or spirit of the subject matter described herein in any way. Other features, aspects, and advantages of the subject matter described herein will become apparent from the following detailed description, the drawings, and the claims.
Drawings
Fig. 1 is a block diagram illustrating a system having an electronic device according to an embodiment of the present disclosure.
Fig. 2 is a flow diagram illustrating a method for coordinating playback of audio content in accordance with an embodiment of the present disclosure.
Fig. 3 is a diagram illustrating communication between the electronic devices in fig. 1 according to an embodiment of the present disclosure.
Fig. 4 is a diagram illustrating coordination of playback of audio content by the electronic device of fig. 1, according to an embodiment of the present disclosure.
Fig. 5 is a flow diagram illustrating a method for coordinating playback of audio content in accordance with an embodiment of the present disclosure.
Fig. 6 is a diagram illustrating communication between the electronic devices in fig. 1 according to an embodiment of the present disclosure.
Fig. 7 is a diagram illustrating coordination of playback of audio content by the electronic device of fig. 1, according to an embodiment of the present disclosure.
Fig. 8 is a flow diagram illustrating a method for coordinating playback of audio content in accordance with an embodiment of the present disclosure.
Fig. 9 is a diagram illustrating communication between the electronic devices in fig. 1 according to an embodiment of the present disclosure.
Fig. 10 is a diagram illustrating coordination of playback of audio content by the electronic device of fig. 1, according to an embodiment of the present disclosure.
Fig. 11 is a flow diagram illustrating a method for selectively determining one or more acoustic characteristics of an environment in accordance with an embodiment of the present disclosure.
Fig. 12 is a diagram illustrating communication between the electronic devices in fig. 1 according to an embodiment of the present disclosure.
Fig. 13 is a diagram illustrating selective acoustic characterization of an environment including the electronic device of fig. 1, according to an embodiment of the present disclosure.
Fig. 14 is a flowchart illustrating a method for calculating an estimated position according to an embodiment of the present disclosure.
Fig. 15 is a diagram illustrating communication between the electronic devices in fig. 1 according to an embodiment of the present disclosure.
FIG. 16 is a diagram illustrating calculating an estimated location of one or more listeners relative to the electronic device of FIG. 1 according to an embodiment of the present disclosure.
Fig. 17 is a flow diagram illustrating a method for aggregating electronic devices, according to an embodiment of the present disclosure.
Fig. 18 is a diagram illustrating communication between the electronic devices in fig. 1 according to an embodiment of the present disclosure.
Fig. 19 is a diagram illustrating aggregating the electronic devices in fig. 1, according to an embodiment of the present disclosure.
Fig. 20 is a flow chart illustrating a method for determining equalized audio content according to an embodiment of the present disclosure.
Fig. 21 is a diagram illustrating communication between the electronic devices in fig. 1 according to an embodiment of the present disclosure.
Fig. 22 is a diagram illustrating determining equalized audio content using the electronic device of fig. 1 according to an embodiment of the present disclosure.
Fig. 23 is a block diagram illustrating one of the electronic devices of fig. 1 in accordance with an embodiment of the present disclosure.
Note that like reference numerals refer to corresponding parts throughout the drawings. Further, multiple instances of the same part are designated by a common prefix separated from the instance number by a dash.
Detailed Description
In a first set of embodiments, an audio/video (A/V) hub that coordinates playback of audio content is described. In particular, the a/V hub may calculate a current time offset between a clock in an electronic device (e.g., an electronic device including a speaker) and a clock in the a/V hub based on a difference between a transmission time of a frame from the electronic device and a reception time when the frame is received. For example, the current time offset may be calculated using wireless ranging by ignoring the distance between the a/V hub and the electronic device. The a/V hub may then transmit one or more frames including the audio content and playback timing information to the electronic device, which may specify a playback time at which the electronic device is to play back the audio content based on the current time offset. Further, the playback times of the electronic devices may have a temporal relationship, thereby coordinating playback of the audio content by the electronic devices.
By coordinating playback of audio content by electronic devices, coordination techniques may provide an improved acoustic experience in an environment that includes an a/V hub and electronic devices. For example, the coordination technique may correct for clock drift between the a/V hub and the electronic device. Alternatively or additionally, the coordination technique may correct or adapt to acoustic characteristics of the environment and/or based on desired acoustic characteristics in the environment. Additionally, the coordination technique may correct the playback time based on an estimated position of the listener relative to the electronic device. In these ways, the coordination techniques may improve acoustic quality and, more generally, improve the user experience when using the a/V hub and the electronic device. Thus, the coordination techniques may improve customer loyalty and revenue for the providers of the a/V hub and the electronic devices.
In a second set of embodiments, an audio/video (A/V) hub is described that selectively determines one or more acoustic characteristics of an environment that includes the A/V hub. In particular, the A/V hub may use wireless communication to detect electronic devices (e.g., electronic devices including speakers) in the environment. The A/V hub may then determine a change condition, such as when an electronic device has not previously been detected in the environment and/or when the location of an electronic device has changed. In response to determining the change condition, the A/V hub may transition to a characterization mode. During the characterization mode, the A/V hub may: provide instructions to the electronic device to play back audio content at a specified playback time; determine one or more acoustic characteristics of the environment based on acoustic measurements in the environment; and store the one or more acoustic characteristics and/or the location of the electronic device in memory.
By selectively determining one or more acoustic characteristics, the characterization techniques may facilitate an improved acoustic experience in an environment that includes the a/V hub and the electronic device. For example, the characterization technique can identify changes and characterize a changed environment, which can then be used to correct the effects of the changes during playback of audio content by one or more electronic devices, including electronic devices. In these ways, the characterization techniques may improve acoustic quality and, more generally, improve the user experience when using the a/V hub and the electronic device. Accordingly, the characterization techniques may improve customer loyalty and revenues for the providers of the a/V hub and the electronic device.
In a third set of embodiments, an audio/video (A/V) hub that coordinates playback of audio content is described. In particular, the a/V hub may calculate a current time offset between a clock in the electronic device (e.g., an electronic device including a speaker) and a clock in the a/V hub based on the measured sound corresponding to the one or more acoustic characterization patterns, one or more times when the electronic device outputs the sound, and the one or more acoustic characterization patterns. The a/V hub may then transmit one or more frames including the audio content and playback timing information to the electronic device, which may specify a playback time at which the electronic device is to play back the audio content based on the current time offset. Further, the playback times of the electronic devices may have a temporal relationship to coordinate the playback of the audio content by the electronic devices.
By coordinating playback of audio content by electronic devices, coordination techniques may provide an improved acoustic experience in an environment that includes an a/V hub and electronic devices. For example, the coordination technique may correct for clock drift between the a/V hub and the electronic device. Alternatively or additionally, the coordination technique may correct or adapt to acoustic characteristics of the environment and/or based on desired acoustic characteristics in the environment. Additionally, the coordination technique may correct the playback time based on an estimated position of the listener relative to the electronic device. In these ways, the coordination techniques may improve acoustic quality and, more generally, improve the user experience when using the a/V hub and the electronic device. Thus, the coordination techniques may improve customer loyalty and revenue for the providers of the a/V hub and the electronic devices.
In a fourth set of embodiments, an audio/video (A/V) hub that calculates an estimated location is described. In particular, the a/V hub may calculate an estimated location of a listener relative to an electronic device (e.g., an electronic device including speakers) in an environment including the a/V hub and the electronic device based on: communication with another electronic device; sound measurement in the environment; and/or time-of-flight measurements. The a/V hub may then transmit one or more frames including the audio content and playback timing information to the electronic device, which may specify a playback time at which the electronic device is to playback the audio content based on the estimated location. Further, the playback times of the electronic devices may have a temporal relationship to coordinate the playback of the audio content by the electronic devices.
By calculating an estimated location of a listener, the characterization techniques may facilitate an improved acoustic experience in an environment that includes an a/V hub and an electronic device. For example, the characterization techniques may track changes in the location of a listener in the environment, which may then be used to correct or adapt the playback of audio content by one or more electronic devices. In these ways, the characterization techniques may improve acoustic quality and, more generally, improve the user experience when using the a/V hub and the electronic device. Accordingly, the characterization techniques may improve customer loyalty and revenues for the providers of the a/V hub and the electronic device.
In a fifth set of embodiments, an audio/video (A/V) hub that aggregates electronic devices is described. In particular, the A/V hub may measure sounds output by electronic devices (e.g., electronic devices including speakers) that correspond to audio content. The A/V hub may then aggregate the electronic devices into two or more subsets based on the measured sounds. Further, the A/V hub may determine playback timing information for the subsets, which may specify playback times when the electronic devices in a given subset are to play back the audio content. Next, the A/V hub may transmit one or more frames including the audio content and the playback timing information to the electronic devices, wherein at least the playback times of the electronic devices in the given subset have a temporal relationship to coordinate the playback of the audio content by the electronic devices in the given subset.
By aggregating electronic devices, characterization techniques may facilitate an improved acoustic experience in an environment that includes an a/V hub and electronic devices. For example, the characterization techniques may aggregate electronic devices based on: different audio content; measuring an acoustic delay of the sound; and/or desired acoustic properties in the environment. In addition, the a/V hub may determine playback volumes for the subsets to use when playing back audio content in order to reduce acoustic crosstalk between two or more subsets. In these ways, the characterization techniques may improve acoustic quality and, more generally, improve the user experience when using the a/V hub and the electronic device. Accordingly, the characterization techniques may improve customer loyalty and revenues for the providers of the a/V hub and the electronic device.
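The grouping criterion is not prescribed by the text; as one hedged example, the sketch below aggregates speakers into subsets by clustering the acoustic delays measured from the sounds they output, so that speakers in acoustically distinct zones land in different subsets. The threshold and names are illustrative.

```python
def aggregate_by_delay(measured_delays: dict[str, float], gap_s: float = 0.010) -> list[list[str]]:
    """Group speakers whose measured acoustic delays (from sound the hub recorded)
    fall within gap_s of one another; widely separated delays suggest speakers in
    different zones, which are placed in different subsets."""
    if not measured_delays:
        return []
    ordered = sorted(measured_delays.items(), key=lambda kv: kv[1])
    subsets, current = [], [ordered[0][0]]
    for (_, prev_d), (name, d) in zip(ordered, ordered[1:]):
        if d - prev_d <= gap_s:
            current.append(name)
        else:
            subsets.append(current)
            current = [name]
    subsets.append(current)
    return subsets

# Example: two speakers near the hub and one in another room.
print(aggregate_by_delay({"left": 0.004, "right": 0.006, "kitchen": 0.031}))
# [['left', 'right'], ['kitchen']]
```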
In a sixth set of embodiments, an audio/video (A/V) hub is described that determines equalized audio content. In particular, the A/V hub may measure sounds output by an electronic device (e.g., an electronic device including speakers) that correspond to audio content. The A/V hub may then compare the measured sound to an expected acoustic characteristic at a first location in the environment based on the first location, a second location of the A/V hub, and an acoustic transfer function of the environment in at least one frequency band. For example, the comparison may include calculating the acoustic transfer function at the first location based on acoustic transfer functions at other locations in the environment, and correcting the measured sound based on the calculated acoustic transfer function at the first location. Further, the A/V hub may determine the equalized audio content based on the comparison and the audio content. Next, the A/V hub may transmit one or more frames including the equalized audio content to the electronic device to facilitate output, by the electronic device, of additional sound corresponding to the equalized audio content.
By determining equalized audio content, signal processing techniques may facilitate an improved acoustic experience in an environment that includes an a/V hub and electronic devices. For example, the signal processing techniques may dynamically change the audio content based on an estimated location of a listener relative to a location of the electronic device and an acoustic transfer function of the environment in the at least one frequency band. This may allow desired acoustic characteristics or a class of audio playback (e.g., mono, stereo, or multi-channel) to be achieved at an estimated location in the environment. In these ways, signal processing techniques may improve acoustic quality and, more generally, improve user experience when using an a/V hub and electronic device. Thus, signal processing techniques may improve customer loyalty and revenues for providers of a/V hubs and electronic devices.
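One simple way to turn such a comparison into equalized audio content is a per-frequency-band gain correction, sketched below with hypothetical band levels; applying the resulting gains to the audio (e.g., with a filter bank) is not shown, and real systems may use more elaborate equalization.

```python
import numpy as np

def equalization_gains(measured_band_levels_db: np.ndarray,
                       desired_band_levels_db: np.ndarray,
                       max_boost_db: float = 6.0) -> np.ndarray:
    """Per-frequency-band correction: boost or cut each band by the difference
    between the desired response and what was measured at the (estimated) listener
    location, with the boost limited to a maximum value."""
    gains = desired_band_levels_db - measured_band_levels_db
    return np.clip(gains, -np.inf, max_boost_db)

# Example: three bands measured 2 dB hot, 4 dB shy, and on target.
measured = np.array([-8.0, -14.0, -10.0])
desired = np.array([-10.0, -10.0, -10.0])
print(equalization_gains(measured, desired))   # [-2.  4.  0.]
```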
In a seventh set of embodiments, an audio/video (A/V) hub that coordinates playback of audio content is described. In particular, the a/V hub may calculate a current time offset between a clock in an electronic device (e.g., an electronic device including a speaker) and a clock in the a/V hub based on a difference between a time of receipt of a frame from the electronic device and an expected time of transmission of the frame. For example, the expected transmission time may be based on coordination of a clock in the electronic device and a clock in the a/V hub at a previous time and a predefined transmission schedule for the frame. The a/V hub may then transmit one or more frames including the audio content and playback timing information to the electronic device, which may specify a playback time at which the electronic device is to play back the audio content based on the current time offset. Further, the playback times of the electronic devices may have a temporal relationship, thereby coordinating playback of the audio content by the electronic devices.
By coordinating playback of audio content by electronic devices, coordination techniques may provide an improved acoustic experience in an environment that includes an a/V hub and electronic devices. For example, the coordination technique may correct for clock drift between the a/V hub and the electronic device. Alternatively or additionally, the coordination technique may correct or adapt to acoustic characteristics of the environment and/or based on desired (or target) acoustic characteristics in the environment. Additionally, the coordination technique may correct the playback time based on an estimated position of the listener relative to the electronic device. In these ways, the coordination techniques may improve acoustic quality and, more generally, improve the user experience when using the a/V hub and the electronic device. Thus, the coordination techniques may improve customer loyalty and revenue for the providers of the a/V hub and the electronic devices.
In the discussion that follows, an A/V hub (sometimes referred to as a "coordinating device"), an A/V display device, a portable electronic device, one or more receiver devices, and/or one or more electronic devices (e.g., speakers, and more generally consumer electronic devices) may include radios that communicate packets or frames according to one or more communication protocols, such as: an Institute of Electrical and Electronics Engineers (IEEE) 802.11 standard (sometimes referred to as "Wi-Fi", from the Wi-Fi Alliance of Austin, Texas), Bluetooth (from the Bluetooth Special Interest Group of Kirkland, Washington), a cellular telephone communication protocol, a near field communication standard or specification (from the NFC Forum of Wakefield, Massachusetts), and/or other types of wireless interfaces. For example, the cellular telephone communication protocol may include or may be compatible with the following: a second generation mobile telecommunications technology, a third generation mobile telecommunications technology (e.g., a communication protocol that conforms to the International Mobile Telecommunications-2000 specification of the International Telecommunication Union of Geneva, Switzerland), a fourth generation mobile telecommunications technology (e.g., a communication protocol that conforms to the International Mobile Telecommunications Advanced specification of the International Telecommunication Union of Geneva, Switzerland), and/or another cellular telephone communication technology. In some embodiments, the communication protocol includes Long Term Evolution or LTE. However, various communication protocols (e.g., Ethernet) may be used. Additionally, communication may occur via various frequency bands. Note that the portable electronic device, the A/V hub, the A/V display device, and/or one or more of the electronic devices may use infrared communication compatible with an infrared communication standard, including one-way infrared communication or two-way infrared communication.
Further, the A/V content in the following discussion may include video and associated audio (e.g., music, sounds, conversations, etc.), including only video or only audio.
Communication between electronic devices is illustrated in FIG. 1, which presents a block diagram illustrating a system 100 having a portable electronic device 110 (e.g., a remote control or a cellular telephone), one or more A/V hubs (e.g., A/V hub 112), one or more A/V display devices 114 (e.g., a television, a monitor, a computer, and more generally a display associated with an electronic device), one or more receiver devices (e.g., receiver device 116, such as a local wireless receiver associated with proximate A/V display device 114-1 that may receive frame-by-frame transcoded A/V content from A/V hub 112 for display on A/V display device 114-1), one or more speakers 118 (and more generally one or more electronic devices that include one or more speakers), and/or one or more content sources 120 associated with one or more content providers, for example: a radio receiver, a video player, a satellite receiver, an access point providing connectivity to a wired network such as the Internet, a media or content source, a consumer electronics device, an entertainment device, a set-top box, over-the-top content delivered over the Internet or a network without the involvement of a cable, satellite or multi-system operator, a security camera, a surveillance camera, etc. Note that A/V hub 112, A/V display devices 114, receiver device 116, and speakers 118 are sometimes collectively referred to as "components" in system 100. However, A/V hub 112, A/V display devices 114, receiver device 116, and/or speakers 118 are sometimes referred to as "electronic devices".
In particular, portable electronic devices 110 and A/V hub 112 may communicate with each other using wireless communication, and one or more other components in system 100 (e.g., at least one of A/V display devices 114, receiver device 116, one of speakers 118, and/or one of content sources 120) may communicate using wireless communication and/or wired communication. During wireless communication, these electronic devices may wirelessly communicate: sending advertising frames on the wireless channel, detecting each other by scanning the wireless channel, establishing a connection (e.g., by sending an association request), and/or sending and receiving packets or frames (which may include related requests and/or additional information as payload, such as information specifying communication capabilities, data, user interfaces, a/V content, etc.).
As described further below with reference to fig. 23, portable electronic device 110, A/V hub 112, A/V display devices 114, receiver device 116, speakers 118, and content sources 120 may include subsystems such as: a network subsystem, a memory subsystem, and a processor subsystem. Further, one or more of portable electronic device 110, A/V hub 112, receiver device 116 and/or speakers 118, and optionally A/V display devices 114 and/or content sources 120, may include a radio 122 in the network subsystem. For example, a radio or receiver device may be located in an A/V display device, e.g., radio 122-5 is included in A/V display device 114-2. Further, note that radios 122 may be instances of the same radio or may be different from one another. More generally, portable electronic device 110, A/V hub 112, receiver device 116, and/or speakers 118 (and optionally A/V display devices 114 and/or content sources 120) may comprise, or may be included within, any electronic device with a network subsystem that enables portable electronic device 110, A/V hub 112, receiver device 116, and/or speakers 118 (and optionally A/V display devices 114 and/or content sources 120) to wirelessly communicate with one another. This wireless communication may include: transmitting advertisements over a wireless channel to enable the electronic devices to make initial contact with or detect one another, and then exchanging subsequent data/management frames (e.g., association requests and responses) to establish a connection, configure security options (e.g., Internet Protocol Security), transmit and receive packets or frames over the connection, and so on.
As can be seen in FIG. 1, a wireless signal 124 (represented by a sawtooth line) is transmitted from radio 122-1 in portable electronic device 110. The wireless signal may be received by at least one of: A/V hub 112, receiver device 116, and/or at least one of speakers 118 (and optionally one or more of A/V display devices 114 and/or content sources 120). For example, portable electronic device 110 may transmit packets. These packets, in turn, may be received by radio 122-2 in A/V hub 112. This may allow portable electronic device 110 to communicate information to A/V hub 112. Although FIG. 1 shows portable electronic device 110 transmitting packets, portable electronic device 110 may also receive packets from A/V hub 112 and/or one or more other components in system 100. More generally, wireless signals may be transmitted and/or received by one or more of the components in system 100.
In the depicted embodiment, the processing of packets or frames in portable electronic device 110, A/V hub 112, receiver device 116, and/or speakers 118 (and optionally one or more of A/V display devices 114 and/or content sources 120) includes: receiving a wireless signal 124 having packets or frames; decoding/extracting the packets or frames from the received wireless signal 124; and processing the packets or frames to determine the information contained in them (e.g., information associated with a data stream). For example, the information from portable electronic device 110 may include: user interface activity information associated with a user interface displayed on a touch-sensitive display (TSD) 128 in portable electronic device 110, the user interface being used by a user of portable electronic device 110 to control at least one of A/V hub 112, A/V display devices 114, speakers 118, and/or content sources 120. (In some embodiments, instead of or in addition to touch-sensitive display 128, portable electronic device 110 includes a user interface with physical knobs and/or buttons that the user may use to control at least one of A/V hub 112, A/V display devices 114, speakers 118, and/or content sources 120.) Alternatively, information from portable electronic device 110, A/V hub 112, one or more A/V display devices 114, receiver device 116, one or more speakers 118, and/or one or more content sources 120 may specify communication capabilities related to communication between portable electronic device 110 and one or more other components in system 100. Further, information from A/V hub 112 may include: device state information (e.g., on, off, play, rewind, fast-forward, a selected channel, selected A/V content, a content source, etc.) relating to a current device state of at least one of A/V display devices 114, at least one of speakers 118, and/or one of content sources 120, or may include user interface information for the user interface (which may be dynamically updated based on the device state information and/or the user interface activity information). Further, information from at least one of A/V hub 112 and/or content sources 120 may include: audio and/or video (sometimes denoted as "audio/video" or "A/V" content) that is displayed or presented on one or more A/V display devices 114, as well as display instructions that specify how the audio and/or video are to be displayed or presented.
However, as previously described, audio and/or video may be communicated between components in the system 100 via wired communication. Thus, as shown in FIG. 1, there may be a wired cable or link, such as a High Definition Multimedia Interface (HDMI) cable 126, for example, between A/V hub 112 and A/V display device 114-3. While audio and/or video may be included in or associated with HDMI content, in other embodiments, audio content may be included in or associated with a/V content that is compatible with another format or standard used in embodiments of the disclosed communication technology. For example, the A/V content may include or may be compatible with the following formats: h.264, MPEG-2, QuickTime video format, MPEG-4, MP4, and/or TCP/IP. Further, the video mode of the a/V content may be 720p, 1080i, 1080p, 1440p, 2000, 2160p, 2540p, 4000p, and/or 4320 p.
Note that A/V hub 112 may determine display instructions (with a display layout) for A/V content based on a format of a display in one of A/V display devices 114 (e.g., A/V display device 114-1). Alternatively, A/V hub 112 may use predetermined display instructions, or A/V hub 112 may modify or transform A/V content based on the display layout such that the modified or transformed A/V content is in an appropriate format for display on the display. Further, the display instructions may specify information to be displayed on a display in A/V display device 114-1, including a location to display A/V content (e.g., in a central window, in a tiled window, etc.). Thus, the information to be displayed (i.e., an instance of the display instruction) may be based on the format of the display, for example: display size, display resolution, display aspect ratio, display contrast, display type, etc. Further, note that when A/V hub 112 receives A/V content from one of content sources 120, A/V hub 112 may provide the A/V content and display instructions as frames to A/V display device 114-1 (where the A/V content is received (e.g., in real-time) from one of content sources 120) for display of the A/V content on a display in A/V display device 114-1. For example, A/V hub 112 may collect A/V content in a buffer until a frame is received, and A/V hub 112 may then provide the complete frame to A/V display device 114-1. Alternatively, A/V hub 112 may provide packets having portions of frames to A/V display device 114-1 upon receiving the packets. In some embodiments, display instructions may be provided to a/V display device 114-1 differently (e.g., as display instructions change), regularly or periodically (e.g., in one of every N packets or in packets in each frame), or in each packet.
Further, note that communication between portable electronic device 110, A/V hub 112, one or more A/V display devices 114, receiver device 116, one or more speakers 118, and/or one or more content sources 120 may be characterized by a variety of performance metrics, such as: a received signal strength indicator (RSSI), a data rate (e.g., a data rate for successful communication, excluding radio-protocol overhead, which is sometimes referred to as "throughput"), an error rate (e.g., a packet error rate, retry rate, or retransmission rate), a mean square error of equalized signals relative to an equalization target, intersymbol interference, multipath interference, a signal-to-noise ratio, a width of an eye pattern, a ratio of the number of bytes successfully communicated during a time interval (e.g., 1-10 seconds) to the estimated maximum number of bytes that can be communicated in that time interval (the latter sometimes referred to as the "capacity" of the channel or link), and/or a ratio of the actual data rate to an estimated maximum data rate (sometimes referred to as "utilization"). In addition, the performance associated with different channels during communication may be monitored individually or jointly (e.g., to identify dropped packets).
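As a small worked example (illustrative values only, not drawn from the patent), the last two ratios mentioned above could be computed as follows:

```python
def utilization(actual_data_rate_bps: float, estimated_max_data_rate_bps: float) -> float:
    """Ratio of the actual data rate to the estimated maximum data rate."""
    return actual_data_rate_bps / estimated_max_data_rate_bps

def capacity_fraction(bytes_delivered: int, estimated_max_bytes: int) -> float:
    """Ratio of bytes successfully communicated during a time interval to the
    estimated maximum ('capacity') for that interval."""
    return bytes_delivered / estimated_max_bytes

print(utilization(18e6, 54e6))                 # ≈ 0.33
print(capacity_fraction(2_250_000, 6_750_000)) # ≈ 0.33
```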
Communication between portable electronic device 110, a/V hub 112, one of a/V display devices 114, receiver device 116, speaker 118, and/or one or more content sources 120 in fig. 1 may involve one or more independent concurrent data streams in different wireless channels (or even different communication protocols, such as different Wi-Fi communication protocols) in one or more connections or links that may communicate using multiple radios. Note that one or more connections or links may each have a separate or different identifier (e.g., a different service set identifier) on a wireless network (which may be a private network or a public network) in system 100. Furthermore, one or more concurrent data streams may be partially or fully redundant on a dynamic or packet-by-packet basis to improve or maintain performance metrics even in the presence of transient changes (e.g., interference, changes in the amount of information that needs to be communicated, movement of the portable electronic device 110, etc.) and to facilitate various services (while remaining compatible with communication protocols (e.g., Wi-Fi communication protocols)) such as: channel calibration, determining one or more performance indicators, performing quality of service characterization without interrupting communication (e.g., performing channel estimation, determining link quality, performing channel calibration, and/or performing spectral analysis associated with at least one channel), seamless switching between different wireless channels, coordinated communication between components, and the like. These features may reduce the number of retransmitted packets and, thus, may reduce latency and avoid communication interruptions, and may enhance the experience of one or more users viewing a/V content on one or more a/V display devices 114 and/or listening to audio output by one or more speakers 118.
As previously described, a user may control at least A/V hub 112, at least one of A/V display devices 114, at least one of speakers 118, and/or at least one of content sources 120 via a user interface displayed on touch-sensitive display 128 in portable electronic device 110. In particular, at a given time the user interface may include one or more virtual icons that allow the user to activate, deactivate, or change functions or capabilities of at least A/V hub 112, at least one of A/V display devices 114, at least one of speakers 118, and/or at least one of content sources 120. For example, a given virtual icon in the user interface may have an associated tap region on the surface of touch-sensitive display 128. If the user makes and then breaks contact with the surface within the tap region (e.g., using one or more fingers or digits, or using a stylus), portable electronic device 110 (e.g., a processor executing a program module) may receive user interface activity information indicating activation of a command or instruction from a touch-screen input/output (I/O) controller coupled to touch-sensitive display 128. (Alternatively, in some embodiments the user may maintain contact with touch-sensitive display 128 at an average contact pressure that is typically less than a threshold, e.g., 10-20 kPa, and may activate a given virtual icon by increasing the average contact pressure with touch-sensitive display 128 above the threshold.) In response, the program module may instruct an interface circuit in portable electronic device 110 to wirelessly transmit the user interface activity information indicating the command or instruction to A/V hub 112, and A/V hub 112 may transmit the command or instruction to a target component in system 100 (e.g., A/V display device 114-1). The instruction or command can cause A/V display device 114-1 to turn on or off, display A/V content from a particular content source, perform a trick mode operation (e.g., fast forward, reverse, fast reverse, or skip), and so forth. For example, A/V hub 112 may request A/V content from content source 120-1 and may then provide the A/V content, along with display instructions, to A/V display device 114-1 for display on A/V display device 114-1. Alternatively or additionally, A/V hub 112 may provide audio content associated with video content from content source 120-1 to one or more of speakers 118.
As previously mentioned, achieving high audio quality in an environment (e.g., a room, building, vehicle, etc.) is often challenging. In particular, achieving high audio quality in an environment often places strong restrictions on the coordination of speakers (e.g., speaker 118). For example, coordination may require maintaining an accuracy of 1-5 μ s (this is a non-limiting exemplary value). In some embodiments, the coordination includes synchronization in the time domain within time or phase accuracy and/or synchronization in the frequency domain within frequency accuracy. Without proper coordination, the acoustic quality in the environment may degrade when listening to audio content and/or a/V content, with a corresponding impact on listener satisfaction and overall user experience.
This challenge may be addressed in the coordination technique by directly or indirectly coordinating speakers 118 with A/V hub 112. As described below with reference to fig. 2-4, in some embodiments wireless communication may be used to facilitate coordinated playback of audio content by speakers 118. In particular, because the speed of light is approximately six orders of magnitude greater than the speed of sound, the propagation delay of a wireless signal in the environment (e.g., a room) is negligible with respect to the desired accuracy of the coordination of speakers 118. For example, the desired accuracy of the coordination of speakers 118 may be on the order of microseconds, while the radio propagation delay in a typical room (e.g., over a distance of 10 to 30 meters) may be one or two orders of magnitude smaller. Accordingly, techniques such as wireless ranging or radio-based distance measurement may be used to coordinate speakers 118. Specifically, during wireless ranging, A/V hub 112 may transmit a frame or packet that includes a transmission time, based on a clock in A/V hub 112, and an identifier of A/V hub 112, and a given one of speakers 118 (e.g., speaker 118-1) may determine the time of arrival or time of receipt of the frame or packet based on a clock in speaker 118-1.
Alternatively, speaker 118-1 may transmit a frame or packet (sometimes referred to as an "input frame") that includes the time of transmission and an identifier of speaker 118-1 based on a clock in speaker 118-1, and a/V hub 112 may determine the time of arrival or time of receipt of the frame or packet based on the clock in a/V hub 112. Typically, the distance between A/V hub 112 and speaker 118-1 is determined based on the product of time of flight (the difference between the arrival time and the transmission time) and the propagation velocity. However, by ignoring the physical distance between the a/V hub 112 and the speaker 118-1, i.e., by assuming instantaneous propagation (introducing negligible static offset for a fixed device in the same room or environment), the difference in arrival time and transmission time can dynamically track drift in the coordination of the clock in the a/V hub 112 and the clock in the speaker 118-1 or the current time offset (and negligible static offset).
The current time offset may be determined by a/V hub 112 or may be provided to a/V hub 112 by speaker 118-1. a/V hub 112 may then transmit one or more frames (sometimes referred to as "output frames") including the audio content and playback timing information to speaker 118-1, which may specify a playback time when speaker 118-1 is to play back the audio content based on the current time offset. This may be repeated for other speakers 118. In addition, the playback times of speakers 118 may have a temporal relationship to coordinate the playback of audio content by speakers 118.
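As a sketch of what such output frames might carry (the structure and field names are hypothetical, not defined by the patent), the following Python fragment builds one frame per speaker, converting the hub's chosen playback instant into each speaker's clock using that speaker's current time offset plus any deliberate relative delay that implements the temporal relationship:

```python
from dataclasses import dataclass

@dataclass
class OutputFrame:
    """Hypothetical payload for an 'output frame': an audio chunk plus playback
    timing information expressed in the receiving speaker's own clock."""
    speaker_id: str
    audio_chunk: bytes
    playback_time: float   # when to start output, in the speaker's clock

def build_output_frames(audio_chunk: bytes, playback_time_hub: float,
                        offsets: dict[str, float],
                        relative_delays: dict[str, float]) -> list[OutputFrame]:
    # offsets[s]: hub clock minus speaker clock; relative_delays[s]: the temporal
    # relationship (0 for simultaneous playback, non-zero for deliberate phasing).
    return [
        OutputFrame(s, audio_chunk,
                    playback_time_hub - offsets[s] + relative_delays.get(s, 0.0))
        for s in offsets
    ]

frames = build_output_frames(b"\x00" * 1024, 1000.250,
                             {"speaker-1": 5.12e-4, "speaker-2": 1.407e-3},
                             {"speaker-2": 2.0e-3})
for f in frames:
    print(f.speaker_id, f.playback_time)
```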
In addition to correcting for drift in the clock, the coordination techniques (and other embodiments of the coordination techniques described below) may provide an improved acoustic experience in an environment that includes the a/V hub 112 and the speakers 118. For example, the coordination technique may correct or adapt predetermined or dynamically determined acoustic characteristics of the environment (as further described below with reference to fig. 11-13) based on desired acoustic characteristics in the environment (e.g., type of playback such as mono, stereo, and/or multi-channel, acoustic radiation patterns such as directional or diffuse, intelligibility, etc.) and/or based on a dynamically estimated location of one or more listeners relative to the speakers 118 (as further described below with reference to fig. 14-16). Additionally, the coordination technique may be used in conjunction with: the speakers 118 are dynamically aggregated into groups (as described further below with reference to fig. 17-19) and/or the audio content is dynamically equalized based on the audio content being played and the difference between the acoustic characteristics and the desired acoustic characteristics in the environment (as described further below with reference to fig. 20-22).
Note that wireless ranging (and in general wireless communication) may be performed at or in one or more frequency bands, for example at or in: 2GHz wireless frequency band, 5GHz wireless frequency band, ISM frequency band, 60GHz wireless frequency band, ultra wide band and the like.
In some embodiments, one or more additional communication techniques may be used to identify and/or exclude multipath wireless signals during the coordination of speakers 118. For example, a/V hub 112 and/or speakers 118 may determine an angle of arrival (including non-line-of-sight reception) using: directional antennas, differential time of arrival at an antenna array with a known position, and/or angle of arrival to two receivers with known positions (i.e., trilateration or multilateration).
Another method for coordinating speakers 118 may use a predetermined transmission time, as described further below with reference to fig. 5-7. In particular, during the calibration mode, the clocks in the A/V hub 112 and speakers 118 may be coordinated. Subsequently, in a normal operating mode, A/V hub 112 may transmit a frame or packet having the identifier of A/V hub 112 at a predetermined transmit time based on a clock in A/V hub 112. However, due to relative drift in the clock in the a/V hub 112, these packets or frames will arrive at the speaker 118 or be received at the speaker 118 at a time that is different from the expected scheduled transmission time based on the clock in the speaker 118. Thus, by again ignoring the propagation delay, the difference between the arrival time and the scheduled transmission time of a given frame at a given one of the speakers 118 (e.g., speaker 118-1) may dynamically track drift in the coordination of the clock in the a/V hub 112 and the clock in speaker 118-1 or the current time offset (and negligible static offset associated with the propagation delay).
Alternatively or additionally, after the calibration mode, the speaker 118 may transmit a frame or packet with an identifier of the speaker 118 at a predetermined transmission time based on a clock in the speaker 118. However, due to drift in the clock in the speaker 118, these packets or frames will arrive at the a/V hub 112 or be received by the a/V hub 112 at a time that is different from the expected scheduled transmission time based on the clock in the a/V hub 112. Thus, by again ignoring propagation delays, the difference between the arrival time and the scheduled transmission time for a given frame from a given one of the speakers 118 (e.g., speaker 118-1) may dynamically track drift in the coordination of the clock in the a/V hub 112 and the clock in speaker 118-1 or the current time offset (and negligible static offset associated with propagation delays).
Again, the current time offset may be determined by the a/V hub 112 or may be provided to the a/V hub 112 by one or more speakers 118 (e.g., speaker 118-1). Note that in some embodiments, the current time offset is also based on a clock drift model in the a/V hub 112 and speakers 118. a/V hub 112 may then transmit one or more frames including the audio content and playback timing information to speaker 118-1, which may specify a playback time when speaker 118-1 is to play back the audio content based on the current time offset. This may be repeated for other speakers 118. In addition, the playback times of speakers 118 may have a temporal relationship to coordinate the playback of audio content by speakers 118.
Further, note that in these embodiments, one or more additional communication techniques may also be used to identify and/or exclude multipath wireless signals during the coordination of the speaker 118.
Another method for coordinating the speakers 118 may use acoustic measurements, as described further below with reference to fig. 8-10. In particular, during the calibration mode, the clocks in the A/V hub 112 and speakers 118 may be coordinated. Subsequently, the a/V hub 112 may output a sound corresponding to an acoustically characterized pattern (e.g., a pulse sequence, different frequency, etc.) that uniquely identifies the a/V hub 112 at a predetermined transmission time. The acoustic characterization pattern may be output at a frequency outside of the human hearing range (e.g., at an ultrasonic frequency). However, due to relative drift in the clock in the a/V hub 112, sound corresponding to the acoustic characterization pattern will be measured (i.e., will arrive or be received) at the speaker 118 at a time that is different from the expected scheduled transmission time based on the clock in the speaker 118. In these embodiments, different times need to be corrected for contributions associated with acoustic propagation delays based on predetermined or known locations of a/V hub 112 and speakers 118 and/or using wireless ranging. For example, triangulation and/or trilateration may be used in a local positioning system, a global positioning system, and/or a wireless network (e.g., a cellular telephone network or a WLAN) to determine location. Thus, after correcting for acoustic propagation delays, the difference between the arrival time and the scheduled transmission time of a given frame at a given one of the speakers 118 (e.g., speaker 118-1) may dynamically track drift or current time offsets in the clocks in the coordinating a/V hub 112 and speaker 118-1.
Alternatively or additionally, after the calibration mode, speakers 118 may output, at predetermined transmission times, sounds corresponding to acoustic characterization patterns (e.g., different pulse sequences, different frequencies, etc.) that uniquely identify speakers 118. However, because of relative drift between the clocks in speakers 118 and A/V hub 112, the sound corresponding to an acoustic characterization pattern will be measured (i.e., will arrive or be received) at A/V hub 112 at a time that differs from the expected scheduled transmission time based on the clock in A/V hub 112. In these embodiments, the measured times need to be corrected for contributions associated with the acoustic propagation delays, based on predetermined or known locations of A/V hub 112 and speakers 118 and/or locations determined using wireless ranging. Thus, after correcting for the acoustic propagation delays, the difference between the measured arrival time of the sound from a given one of speakers 118 (e.g., speaker 118-1) and the scheduled transmission time may dynamically track the drift in the coordination of the clocks in A/V hub 112 and speaker 118-1, i.e., the current time offset.
Again, the current time offset may be determined by a/V hub 112 or may be provided to a/V hub 112 by speaker 118-1. a/V hub 112 may then transmit one or more frames including the audio content and playback timing information to speaker 118-1, which may specify a playback time when speaker 118-1 is to play back the audio content based on the current time offset. This may be repeated for other speakers 118. In addition, the playback times of speakers 118 may have a temporal relationship to coordinate the playback of audio content by speakers 118.
Although we describe the network environment shown in fig. 1 as an example, in alternative embodiments, there may be a different number or type of electronic devices. For example, some embodiments include more or fewer electronic devices. As another example, in another embodiment, a different electronic device is transmitting and/or receiving packets or frames. Although portable electronic device 110 and a/V hub 112 are shown with a single radio 122, in other embodiments, portable electronic device 110 and a/V hub 112 (and optionally a/V display device 114, receiver device 116, speakers 118, and/or content source 120) may include multiple radios.
We now describe embodiments of communication techniques. Fig. 2 presents a flow diagram illustrating a method 200 for coordinating playback of audio content, which may be performed by an a/V hub, such as a/V hub 112 (fig. 1). During operation, an a/V hub (e.g., control circuitry or control logic, e.g., a processor executing a program module in the a/V hub) may receive frames or packets from one or more electronic devices via wireless communication (operation 210), where a given frame or packet includes a transmission time when the given electronic device transmitted the given frame or packet.
The a/V hub may then store a receive time when the frame or packet was received (operation 212), where the receive time is based on a clock in the a/V hub. For example, the receive time may be added to an instance of a frame or packet received from one electronic device by or associated with a physical layer and/or Media Access Control (MAC) layer in an interface circuit in the a/V hub. Note that the receive time may be associated with a leading edge or a trailing edge of a frame or packet, e.g., the receive time may be associated with a receive time signal associated with the leading edge or a receive clear signal associated with the trailing edge. Similarly, the transmission time may be added to an instance of a frame or packet transmitted by an electronic device by a physical and/or MAC layer in or associated with an interface circuit in the electronic device. In some embodiments, the transmission time and the reception time are determined and added to the frame or packet by wireless ranging capabilities in the interface circuit or in a physical layer and/or a MAC layer associated with the interface circuit.
Further, the A/V hub may calculate a current time offset between a clock in the electronic device and a clock in the A/V hub based on the reception time and the transmission time of the frame or packet (operation 214). Furthermore, the A/V hub may calculate the current time offset based on a model of the clock drift in the electronic device (e.g., a circuit model of a clock circuit and/or a look-up table of clock drift as a function of time). Note that the electronic device may be located at a non-zero distance from the A/V hub; the current time offset may be calculated from the transmission time and the reception time while ignoring this distance (and, thus, the propagation delay), or the distance may be determined using wireless ranging so that the propagation delay can be taken into account.
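For illustration only, the arithmetic of operation 214 can be sketched in Python as follows; the function name, the units (seconds), and the numeric values are assumptions made for this sketch rather than part of the described embodiments:

```python
# Illustrative sketch of operation 214: estimating the current time offset
# between the electronic-device clock and the A/V-hub clock from timestamped
# frames. The propagation delay is ignored, as described above.

def current_time_offset(transmit_times, receive_times):
    """Offset of the hub clock relative to the device clock, in seconds.

    transmit_times: transmit times stamped by the device's clock.
    receive_times:  receive times stamped by the hub's clock.
    Averaging over several frames reduces the effect of timestamp jitter.
    """
    offsets = [rx - tx for tx, rx in zip(transmit_times, receive_times)]
    return sum(offsets) / len(offsets)

# Example: three frames stamped by the two clocks (values in seconds).
tx = [10.000000, 10.100000, 10.200000]
rx = [10.000450, 10.100452, 10.200455]
print(round(current_time_offset(tx, rx), 6))  # approximately 0.000452
```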
Next, the A/V hub may transmit one or more frames or packets including the audio content and playback timing information to the electronic device (operation 216), where the playback timing information specifies a playback time when the electronic device is to play back the audio content based on the current time offset. Further, the playback times of the electronic devices may have a temporal relationship, thereby coordinating playback of the audio content by the electronic devices. Note that the temporal relationship may have a non-zero value, such that instructing at least some of the electronic devices to use different values of the playback times causes the audio content to be played back with a phase relationship relative to one another. For example, the different playback times may be based on predetermined or dynamically determined acoustic characteristics of the environment that includes the electronic devices and the A/V hub. Alternatively or additionally, the different playback times may be based on desired acoustic characteristics in the environment.
In some embodiments, the A/V hub optionally performs one or more additional operations (operation 218). For example, the electronic device may be located at a vector distance from the A/V hub, and the interface circuit may determine the magnitude of the vector distance based on the transmit time and the receive time using wireless ranging. Further, the interface circuit may determine the angle of the vector distance based on the angle of arrival of wireless signals associated with frames or packets received by the one or more antennas during the wireless communication. Moreover, the different playback times may be based on the determined vector distances. For example, the playback times may correspond to the determined vector distances such that the sounds associated with the audio content from the different electronic devices at the different locations in the environment arrive at a location in the environment with a desired phase relationship, or achieve desired acoustic characteristics at that location (e.g., the location of the A/V hub, the middle of the environment, a preferred listening position of the user, etc.).
Alternatively or additionally, the different playback times are based on an estimated position of the listener relative to the electronic device, such that sounds associated with audio content from different electronic devices at different locations in the environment may reach the estimated position of the listener with a desired phase relationship or achieve desired acoustic characteristics at the estimated position. Techniques that may be used to determine the location of a listener are further described below with reference to fig. 14-16.
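As an illustration of how different playback times could realize the desired phase relationship at a chosen location, the following Python sketch delays each speaker so that its direct sound arrives at a target position (e.g., the location of the A/V hub or the estimated location of the listener) at the same instant; the positions, the nominal speed of sound, and the function names are assumptions for this sketch:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, an assumed nominal value near room temperature

def playback_delays(speaker_positions, target_position):
    """Per-speaker playback delays (s) so that the direct sound from every
    speaker arrives at target_position at the same instant; the speaker
    farthest from the target receives no added delay."""
    travel_times = [math.dist(p, target_position) / SPEED_OF_SOUND
                    for p in speaker_positions]
    latest = max(travel_times)
    return [latest - t for t in travel_times]

speakers = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]   # assumed speaker positions (m)
listener = (1.0, 1.0)                              # assumed target position (m)
print([round(d * 1000, 2) for d in playback_delays(speakers, listener)])  # ms
```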
Note that while the wireless ranging capability in the interface circuit may involve coordinated clocks in the a/V hub and the electronic device, in other embodiments the clocks are not coordinated. Thus, various radiolocation techniques may be used. In some embodiments, wireless ranging capabilities include generating pulses of short duration (e.g., about 1ns) using transmissions over GHz or multi-GHz bandwidths.
Fig. 3 is a diagram illustrating communication between A/V hub 112 and speaker 118-1. In particular, interface circuit 310 in speaker 118-1 may transmit one or more frames or packets (e.g., packet 312) to A/V hub 112. Packet 312 may include a corresponding transmit time 314 at which speaker 118-1 transmits packet 312 based on an interface clock 316, the interface clock 316 being provided by an interface clock circuit 318, the interface clock circuit 318 being located in or associated with interface circuit 310 in speaker 118-1. When interface circuit 320 in A/V hub 112 receives packet 312, it may include a receive time 322 in packet 312 (or it may store receive time 322 in memory 324), where for each packet the respective receive time may be based on an interface clock 326 provided by an interface clock circuit 328, the interface clock circuit 328 being located in or associated with interface circuit 320.
Interface circuit 320 may then calculate a current time offset 330 between interface clock 316 and interface clock 326 based on the difference between transmit time 314 and receive time 322. Further, the interface circuit 320 may provide the current time offset 330 to the processor 332. (alternatively, processor 332 may calculate current time offset 330.)
Further, processor 332 may provide playback timing information 334 and audio content 336 to interface circuit 320, where playback timing information 334 specifies a playback time at which speaker 118-1 is to play back audio content 336 based on current time offset 330. In response, interface circuit 320 may send one or more frames or packets 338 that include playback timing information 334 and audio content 336 to speaker 118-1. (However, in some embodiments, separate or distinct frames or packets are used to send playback timing information 334 and audio content 336.)
After interface circuit 310 receives one or more frames or packets 338, it may provide playback timing information 334 and audio content 336 to processor 340. Processor 340 may run software that performs a playback operation 342. For example, processor 340 may store audio content 336 in a queue in memory. In these embodiments, playback operation 342 includes outputting audio content 336 from the queue, including driving an electroacoustic wave sensor in speaker 118-1 based on audio content 336 so that speaker 118-1 outputs sound at the time specified by playback timing information 334.
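A simplified sketch of this speaker-side handling is shown below; it assumes the playback time is expressed in the hub's clock and is converted to the speaker's local clock using the current time offset, and the output-driving call is a placeholder rather than a real API:

```python
import time
from collections import deque

audio_queue = deque()

def schedule_playback(audio_frames, playback_time_hub, current_time_offset):
    # current_time_offset is assumed to be (hub clock - speaker clock), so the
    # hub-referenced playback time is converted to the speaker's local clock.
    playback_time_local = playback_time_hub - current_time_offset
    audio_queue.append((playback_time_local, audio_frames))

def playback_loop(local_clock=time.monotonic):
    # Release queued audio once the local clock reaches its playback time.
    while audio_queue:
        playback_time_local, frames = audio_queue[0]
        wait = playback_time_local - local_clock()
        if wait > 0:
            time.sleep(min(wait, 0.001))   # coarse wait; a sketch only
            continue
        audio_queue.popleft()
        drive_output(frames)

def drive_output(frames):
    """Placeholder for the hardware-specific audio output path."""
    pass
```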
In an exemplary embodiment, communication techniques are used to coordinate the playback of audio content by speakers 118. This is illustrated in fig. 4, which presents a diagram illustrating coordinated playback of audio content by speakers 118. In particular, when frames or packets (e.g., packet 410-1) are transmitted by speakers 118, they may include information specifying the time of transmission (e.g., time of transmission 412-1). For example, the physical layer in the interface circuitry in speakers 118 may include the transmission time in packets 410. Note that, in fig. 4 and the other embodiments below, the information in a frame or packet may be included at any location (e.g., the beginning, the middle, and/or the end).
When a packet 410 is received by a/V hub 112, additional information specifying a time of receipt (e.g., time of receipt 414-1 of packet 410-1) may be included in packet 410. For example, the physical layer in the interface circuitry in a/V hub 112 may include the time of receipt in packet 410. In addition, the time of transmission and time of reception may be used to track clock drift in the a/V hub 112 and speakers 118.
Using the transmit times and the receive times, A/V hub 112 may calculate the current time offsets between the clocks in speakers 118 and the clock in A/V hub 112. Further, A/V hub 112 may calculate the current time offsets based on models, in A/V hub 112, of the clock drift in speakers 118. For example, a model of relative or absolute clock drift may include a polynomial or cubic-spline expression (and, more generally, a function) with parameters that specify or estimate the clock drift in a given speaker as a function of time based on historical time offsets.
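One possible realization of such a drift model is sketched below in Python: a low-order polynomial is fit to historical time offsets and then evaluated at a later time; numpy, the polynomial degree, and the sample values are assumptions for this sketch:

```python
import numpy as np

def fit_drift_model(sample_times, offsets, degree=2):
    # Fit clock offset (s) versus time (s) with a low-order polynomial.
    return np.polynomial.Polynomial.fit(sample_times, offsets, degree)

def predicted_offset(model, t):
    return float(model(t))

history_t = [0.0, 60.0, 120.0, 180.0]                 # seconds
history_off = [0.00045, 0.00051, 0.00058, 0.00064]    # measured offsets (s)
model = fit_drift_model(history_t, history_off)
print(round(predicted_offset(model, 240.0), 6))       # extrapolated offset
```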
Subsequently, A/V hub 112 may transmit to speakers 118 one or more packets or frames that include audio content 420 and playback timing information (e.g., playback timing information 418-1 in packet 416-1), which specifies the playback times when speakers 118 are to play back audio content 420 based on the current time offsets. The playback times of speakers 118 may have a temporal relationship such that the playback of audio content 420 by speakers 118 is coordinated, e.g., such that the associated sounds or wavefronts arrive at a location 422 in the environment with a desired phase relationship.
Another embodiment of coordination in the communication technique is illustrated in fig. 5, which presents a flow diagram illustrating a method 500 for coordinating playback of audio content. Note that method 500 may be performed by an A/V hub (e.g., A/V hub 112 (FIG. 1)). During operation, an A/V hub (e.g., control circuitry or control logic such as a processor running a program module in the A/V hub) may receive frames or packets from an electronic device via wireless communication (operation 510).
The A/V hub may then store a receive time when the frame or packet was received (operation 512), where the receive time is based on a clock in the A/V hub. For example, the receive time may be added to an instance of a frame or packet received from one electronic device by a physical and/or MAC layer in or associated with an interface circuit in the a/V hub. Note that the receive time may be associated with a leading edge or a trailing edge of a frame or packet, e.g., the receive time may be associated with a receive time signal associated with the leading edge or with a receive clear signal associated with the trailing edge.
Further, the A/V hub may calculate a current time offset between the clock in the electronic device and the clock in the A/V hub based on the time of receipt of the frame or packet and an expected transmission time, where the expected transmission time is based on coordination of the clock in the electronic device and the clock in the A/V hub at a previous time and on a predefined transmission schedule for the frames or packets (e.g., every 10 ms or every 100 ms, as non-limiting examples) (operation 514). For example, during an initialization mode, the time offset between the clock in the electronic device and the clock in the A/V hub may be eliminated (i.e., coordination may be established). Note that the scheduled transmission times in the transmission schedule may or may not include beacon transmission times in the WLAN. Subsequently, the clocks may exhibit relative drift, which may be tracked based on the difference between the time of receipt and the expected transmission time of a frame or packet. In some embodiments, the A/V hub calculates the current time offset based on a model of the clock drift in the electronic device.
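For illustration only, the comparison in operation 514 can be sketched as follows; the schedule period, the frame index, and the function names are assumptions for this sketch:

```python
SCHEDULE_START = 0.0      # hub-clock time of the first scheduled frame (s)
SCHEDULE_PERIOD = 0.100   # assumed 100 ms schedule, a non-limiting example

def offset_from_schedule(frame_index, receive_time):
    # The transmission schedule is known in advance, so the expected transmit
    # time can be reconstructed from the frame index. Ignoring the propagation
    # delay, the difference tracks the relative drift of the two clocks.
    expected_transmit = SCHEDULE_START + frame_index * SCHEDULE_PERIOD
    return receive_time - expected_transmit

print(round(offset_from_schedule(42, 4.200310), 6))   # ~0.00031 s of drift
```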
Next, the A/V hub may transmit one or more frames or packets including the audio content and playback timing information to the electronic device (operation 516), where the playback timing information specifies a playback time when the electronic device is to play back the audio content based on the current time offset. Further, the playback times of the electronic devices may have a temporal relationship, thereby coordinating playback of the audio content by the electronic devices. Note that the temporal relationship may have a non-zero value, such that instructing at least some of the electronic devices to use different values of the playback times causes the audio content to be played back with a phase relationship relative to one another. For example, the different playback times may be based on predetermined or dynamically determined acoustic characteristics of the environment that includes the electronic devices and the A/V hub. Alternatively or additionally, the different playback times may be based on desired acoustic characteristics in the environment.
In some embodiments, the A/V hub optionally performs one or more additional operations (operation 518). For example, the electronic device may be located at a vector distance from the A/V hub, and the interface circuit may determine the magnitude of the vector distance based on the transmit time and the receive time using wireless ranging. Further, the interface circuit may determine the angle of the vector distance based on the angle of arrival of wireless signals associated with frames or packets received by the one or more antennas during the wireless communication. Moreover, the different playback times may be based on the determined vector distances. For example, the playback times may correspond to the determined vector distances such that the sounds associated with the audio content from the different electronic devices at the different locations in the environment arrive at a location in the environment with a desired phase relationship, or achieve desired acoustic characteristics at that location (e.g., the location of the A/V hub, the middle of the environment, a preferred listening position of the user, etc.).
Alternatively or additionally, the different playback times may be based on an estimated location of the listener relative to the electronic devices, such that the sounds associated with the audio content from the different electronic devices at the different locations in the environment arrive at the estimated location of the listener with a desired phase relationship, or achieve desired acoustic characteristics at the estimated location. Techniques that may be used to determine the location of a listener are described further below with reference to figs. 14-16.
Fig. 6 is a diagram illustrating communication between portable electronic device 110, A/V hub 112, and speaker 118-1. Specifically, during an initialization mode, interface circuit 610 in A/V hub 112 may send a frame or packet 612 to interface circuit 614 in speaker 118-1. The packet may include information 608 for coordinating interface clock 628 and interface clock 606, which are provided by interface clock circuits 616 and 618, respectively. For example, this information may eliminate the time offset between interface clock circuits 616 and 618 and/or may set interface clock circuits 616 and 618 to the same clock frequency.
Interface circuit 614 may then transmit one or more frames or packets (e.g., packet 620) to a/V hub 112 at a predetermined transmit time 622.
When the interface circuit 610 in the a/V hub 112 receives a packet 620, it may include a receive time 624 in the packet 620 (or it may store the receive time 624 in a memory 626), where for each packet the respective receive time may be based on an interface clock 628 provided by the interface clock circuit 616, the interface clock circuit 616 being located in the interface circuit 610 or associated with the interface circuit 610.
The interface circuit 610 may then calculate a current time offset 630 between the interface clock 628 and the interface clock 606 based on the difference between the transmit time 622 and the receive time 624. Further, interface circuit 610 may provide current time offset 630 to processor 632. (alternatively, processor 632 may calculate the current time offset 630.)
Further, processor 632 may provide playback timing information 634 and audio content 636 to interface circuit 610, where playback timing information 634 specifies a playback time at which speaker 118-1 is to play back audio content 636 based on current time offset 630. In response, interface circuit 610 may send one or more frames or packets 638 that include playback timing information 634 and audio content 636 to speaker 118-1. (However, in some embodiments, separate or distinct frames or packets are used to send playback timing information 634 and audio content 636.)
After interface circuit 614 receives one or more frames or packets 638, it may provide playback timing information 634 and audio content 636 to processor 640. Processor 640 may run software that performs a playback operation 642. For example, processor 640 may store audio content 636 in a queue in memory. In these embodiments, playback operation 642 includes outputting audio content 636 from the queue, including driving an electroacoustic wave sensor in speaker 118-1 based on audio content 636 so that speaker 118-1 outputs sound at the time specified by playback timing information 634.
In an exemplary embodiment, communication techniques are used to coordinate the playback of audio content by speakers 118. This is illustrated in fig. 7, which presents a diagram illustrating coordinated playback of audio content by speakers 118. In particular, A/V hub 112 may send frames or packets 710 to speakers 118 with information, provided by a clock circuit, for coordinating the clocks in A/V hub 112 and speakers 118 (e.g., information 708 in packet 710-1).
Speaker 118 may then transmit frame or packet 712 to a/V hub 112 at a predetermined transmission time. When a/V hub 112 receives these frames or packets, information specifying the time of receipt (e.g., time of receipt 714-1 in packet 712-1) may be included in packet 712. The predetermined transmit and receive times may be used to track clock drift in the a/V hub 112 and speakers 118.
Using the predetermined transmit times and the receive times, A/V hub 112 may calculate the current time offsets between the clocks in speakers 118 and the clock in A/V hub 112. Further, A/V hub 112 may calculate the current time offsets based on models, in A/V hub 112, of the clock drift in speakers 118. For example, a model of relative or absolute clock drift may include a polynomial or cubic-spline expression (and, more generally, a function) with parameters that specify or estimate the clock drift in a given speaker as a function of time based on historical time offsets.
Subsequently, A/V hub 112 may transmit to speakers 118 one or more frames or packets that include audio content 720 and playback timing information (e.g., playback timing information 718-1 in packet 716-1), where the playback timing information specifies the playback times when speakers 118 are to play back audio content 720 based on the current time offsets. The playback times of speakers 118 may have a temporal relationship such that the playback of audio content 720 by speakers 118 is coordinated, e.g., such that the associated sounds or wavefronts arrive at a location 722 in the environment with a desired phase relationship.
Another embodiment of coordination in the communication technique is illustrated in fig. 8, which presents a flow diagram illustrating a method 800 for coordinating playback of audio content. Note that method 800 may be performed by an A/V hub (e.g., A/V hub 112 (FIG. 1)). During operation, an A/V hub (e.g., control circuitry or control logic such as a processor running a program module in the A/V hub) may measure, using one or more acoustic wave sensors in the A/V hub, sound output by electronic devices in an environment that includes the A/V hub (operation 810), where the sound corresponds to one or more acoustic characterization patterns. For example, measuring the sound may include measuring a sound pressure. Note that the acoustic characterization patterns may include pulses. Furthermore, the sound may be in a frequency range outside the range of human hearing, such as ultrasound.
Further, a given electronic device may output sound at a different time of one or more times than the time used by the remaining electronic devices, such that the sound from the given electronic device may be identified and distinguished from the sound output from the remaining electronic devices. Alternatively or additionally, the sound output by a given electronic device may correspond to a given acoustic characterization pattern, which may be different from the acoustic characterization pattern used by the remaining electronic devices. Thus, the acoustic characterization pattern may uniquely identify the electronic device.
The A/V hub may then calculate a current time offset between a clock in the electronic device and a clock in the A/V hub based on the measured sound, the one or more times when the electronic device output the sound, and the one or more acoustic characterization patterns (operation 812). For example, the A/V hub may correct the measured sound based on an acoustic characteristic of the environment (e.g., an acoustic delay associated with at least one particular frequency, or a predetermined or dynamically determined acoustic transfer function of the environment in at least one frequency band, such as 100 Hz to 20,000 Hz as a non-limiting example), and may compare the output time to a triggered output time or a predetermined output time. This may allow the A/V hub to determine the original output sound without the spectral filtering or distortion associated with the environment, which may allow the A/V hub to determine the current time offset more accurately.
Note that the measured sound may include information specifying the one or more times at which the electronic device output the sound (e.g., each pulse in the acoustic characterization pattern may specify a time), and the one or more times may correspond to the clock in the electronic device. Alternatively or additionally, the A/V hub may optionally provide to the electronic device, via wireless communication, the one or more times at which the electronic device is to output the sound (operation 808), and the one or more times may correspond to the clock in the A/V hub. For example, the A/V hub may transmit one or more frames or packets having the one or more times to the electronic device. Thus, the A/V hub may trigger the output of the sound, or the sound may be output at a predetermined output time.
Next, the A/V hub may transmit, using wireless communication, one or more frames or packets including the audio content and playback timing information to the electronic device (operation 814), where the playback timing information specifies a playback time at which the electronic device is to play back the audio content based on the current time offset. Furthermore, the playback times of the electronic devices have a temporal relationship, thereby coordinating the playback of the audio content by the electronic devices. Note that the temporal relationship may have non-zero values, such that instructing at least some of the electronic devices to use different values of the playback times causes the audio content to be played back with a phase relationship relative to one another. For example, the different playback times may be based on predetermined or dynamically determined acoustic characteristics of the environment that includes the electronic devices and the A/V hub. Alternatively or additionally, the different playback times may be based on desired acoustic characteristics in the environment and/or an estimated location of the listener relative to the electronic devices.
In some embodiments, the A/V hub optionally performs one or more additional operations (operation 816). For example, the A/V hub may modify the measured sound based on an acoustic transfer function of the environment in at least one frequency band that includes the spectral content of the acoustic characterization pattern. Note that the acoustic transfer function may be predetermined and accessed by the A/V hub, or dynamically determined by the A/V hub. Such an environment-dependent filtering correction may be needed because, although the time delay and dispersion associated with sound propagation in the environment may be much larger than the required accuracy of the coordination of the clock in the electronic device and the clock in the A/V hub, the leading edge of the corrected direct sound can be determined with sufficient accuracy that the current time offset between the clock in the electronic device and the clock in the A/V hub can be determined. For example, the desired accuracy of the coordination of speakers 118 may be as small as a microsecond, while the propagation delay of sound in a typical room (e.g., over a distance of up to 10 meters to 30 meters) may be five orders of magnitude larger. Nevertheless, the modified measured sound may allow the leading edge of the direct sound associated with a pulse output by a given electronic device to be measured with accuracy as fine as a microsecond, which may facilitate coordination of the clock in the electronic device and the clock in the A/V hub. In some embodiments, the A/V hub determines the temperature in the environment and may correct the calculation of the current time offset for changes in the temperature (which affect the speed of sound in the environment).
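The correction for the acoustic propagation delay, including a temperature-dependent speed of sound, can be sketched as follows; the dry-air approximation for the speed of sound and the example numbers are assumptions for this sketch:

```python
def speed_of_sound(temperature_c):
    # Approximate speed of sound in dry air (m/s).
    return 331.3 + 0.606 * temperature_c

def acoustic_time_offset(arrival_time, scheduled_output_time,
                         distance_m, temperature_c=20.0):
    # Subtract the acoustic propagation delay over the known hub-speaker
    # distance; the remainder reflects the relative drift of the two clocks.
    propagation_delay = distance_m / speed_of_sound(temperature_c)
    return (arrival_time - propagation_delay) - scheduled_output_time

# Example: a pulse scheduled at 2.000000 s is measured 8.76 ms later over 3 m.
print(round(acoustic_time_offset(2.008760, 2.000000, 3.0, 21.0), 6))
```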
Fig. 9 is a diagram illustrating communication between portable electronic device 110, A/V hub 112, and speaker 118-1. In particular, processor 910 in speaker 118-1 may instruct 912 one or more acoustic wave sensors 914 in speaker 118-1 to output sound at an output time, where the sound corresponds to an acoustic characterization pattern. For example, the output times may be predefined (e.g., based on a pattern or pulse sequence in the acoustic characterization pattern, a predetermined output schedule with predetermined output times, or a predetermined interval between output times), and, thus, the output times may be known to A/V hub 112 and speaker 118-1. Alternatively, interface circuit 916 in A/V hub 112 may provide a trigger frame or packet 918. After interface circuit 920 receives trigger packet 918, it may forward an instruction 922 to processor 910 in speaker 118-1, which then triggers the output of the sound from the one or more acoustic wave sensors 914 based on instruction 922.
Subsequently, one or more acoustic sensors 924 in a/V hub 112 may measure 926 the sound, and may provide information 928 specifying the measurement to a processor 930 in a/V hub 112.
Next, processor 930 may calculate a current time offset 932 between a clock provided by a clock circuit (e.g., an interface clock circuit) in speaker 118-1 and a clock provided by a clock circuit (e.g., an interface clock circuit) in A/V hub 112 based on information 928, the one or more times when speaker 118-1 output the sound, and the acoustic characterization pattern associated with speaker 118-1. For example, when the one or more acoustic wave sensors 914 in speaker 118-1 output sound corresponding to the acoustic characterization pattern, processor 930 may determine current time offset 932 based on at least two times in the acoustic characterization pattern.
Further, the processor 930 may provide the playback timing information 934 and the audio content 936 to the interface circuit 916, where the playback timing information 934 specifies a playback time for the speaker 118-1 to playback the audio content 936 based on the current time offset 932. Note that the processor 930 may access the audio content 936 in the memory 938. In response, interface circuit 916 may send one or more frames or packets 940 including playback timing information 934 and audio content 936 to speaker 118-1. (however, in some embodiments, separate or distinct frames or packets are used to send playback timing information 934 and audio content 936.)
After interface circuit 920 receives one or more frames or packets 940, it may provide playback timing information 934 and audio content 936 to processor 910. Processor 910 may run software that performs a playback operation 942. For example, processor 910 may store audio content 936 in a queue in memory. In these embodiments, playback operation 942 includes outputting audio content 936 from the queue, including driving the one or more acoustic wave sensors 914 based on audio content 936 so that speaker 118-1 outputs sound at the time specified by playback timing information 934.
In an exemplary embodiment, communication techniques are used to coordinate the playback of audio content by speakers 118. This is illustrated in fig. 10, which presents a diagram illustrating coordinated playback of audio content by speakers 118. In particular, speakers 118 may output sounds 1010 corresponding to acoustic characterization patterns. For example, the acoustic characterization pattern associated with speaker 118-1 may include two or more pulses 1012, where the time interval 1014 between pulses 1012 may correspond to a clock provided by a clock circuit in speaker 118-1. In some embodiments, the pattern or pulse sequence in the acoustic characterization pattern may also uniquely identify a given one of speakers 118. While pulses 1012 are used to illustrate the acoustic characterization pattern in fig. 10, in other embodiments a variety of time, frequency, and/or modulation techniques may be used, including: amplitude modulation, frequency modulation, phase modulation, and the like. Note that, when speakers 118 are to output sounds 1010 corresponding to the acoustic characterization patterns, A/V hub 112 may selectively trigger the output of sounds 1010 by sending to speakers 118 one or more frames or packets 1016 that include information 1018 specifying the times.
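One way the patterns could uniquely identify speakers 118 is sketched below: each speaker is assigned its own inter-pulse interval on an ultrasonic carrier, and the A/V hub matches the measured pulse spacing against the known assignments; the identifiers, intervals, and carrier frequency are invented values for this sketch:

```python
ULTRASONIC_CARRIER_HZ = 21000   # assumed carrier just above human hearing

ACOUSTIC_PATTERNS = {
    "speaker-118-1": {"num_pulses": 3, "pulse_interval_s": 0.010},
    "speaker-118-2": {"num_pulses": 3, "pulse_interval_s": 0.013},
    "speaker-118-3": {"num_pulses": 3, "pulse_interval_s": 0.017},
}

def identify_speaker(measured_pulse_times, tolerance_s=0.001):
    """Match measured pulse arrival times against the known patterns."""
    intervals = [b - a for a, b in zip(measured_pulse_times,
                                       measured_pulse_times[1:])]
    mean_interval = sum(intervals) / len(intervals)
    for speaker_id, pattern in ACOUSTIC_PATTERNS.items():
        if abs(mean_interval - pattern["pulse_interval_s"]) <= tolerance_s:
            return speaker_id
    return None

print(identify_speaker([5.0000, 5.0131, 5.0262]))   # -> "speaker-118-2"
```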
A/V hub 112 may then use one or more acoustic wave sensors to measure sounds 1010 output by speakers 118, where the sounds correspond to the one or more acoustic characterization patterns. After measuring sounds 1010, A/V hub 112 may calculate the current time offsets between the clocks in speakers 118 and the clock in A/V hub 112 based on the measured sounds 1010, the one or more times when speakers 118 output the sounds, and the one or more acoustic characterization patterns. In some embodiments, A/V hub 112 may calculate the current time offsets based on models, in A/V hub 112, of the clock drift in speakers 118. For example, a model of relative or absolute clock drift may include a polynomial or cubic-spline expression (and, more generally, a function) with parameters that specify or estimate the clock drift in a given speaker as a function of time based on historical time offsets.
Next, A/V hub 112 may transmit to speakers 118 one or more frames or packets that include audio content 1022 and playback timing information (e.g., playback timing information 1024-1 in packet 1020-1), which specifies the playback times when speakers 118 are to play back audio content 1022 based on the current time offsets. The playback times of speakers 118 may have a temporal relationship such that the playback of audio content 1022 by speakers 118 is coordinated, e.g., such that the associated sounds or wavefronts arrive at a location 1026 in the environment with a desired phase relationship.
The communication techniques may include operations for adapting coordination to improve the listener's acoustic experience. One method is illustrated in fig. 11, which presents a flow diagram illustrating a method 1100 for selectively determining one or more acoustic characteristics of an environment (e.g., a room). Method 1100 may be performed by an a/V hub, such as a/V hub 112 (fig. 1). During operation, an a/V hub (e.g., control circuitry or control logic such as a processor running program modules in the a/V hub) may optionally detect electronic devices in the environment using wireless communication (operation 1110). Alternatively or additionally, the a/V hub may determine a change condition (operation 1112), wherein the change condition includes: an electronic device has not been previously detected in the environment; and/or a change in the location of the electronic device (including a change in location that occurs long after the electronic device is first detected in the environment).
When a change condition is determined (operation 1112), the A/V hub may transition to a characterization mode (operation 1114). During the characterization mode, the A/V hub may: provide instructions to the electronic device (operation 1116) to play back the audio content at the specified playback time; determine one or more acoustic characteristics of the environment based on acoustic measurements in the environment (operation 1118); and store characterization information in memory (operation 1120), where the characterization information includes the one or more acoustic characteristics.
Further, the A/V hub may transmit one or more frames or packets including the additional audio content and playback timing information to the electronic device (operation 1122), where the playback timing information may specify a playback time when the electronic device is to play back the additional audio content based on the one or more acoustic characteristics.
In some embodiments, the A/V hub optionally performs one or more additional operations (operation 1124). For example, the a/V hub may calculate the location of the electronic device in the environment, e.g., based on wireless communication. Further, the characterization information may include an identifier of the electronic device, which the a/V hub may receive from the electronic device using wireless communication.
Further, the A/V hub may determine the one or more acoustic characteristics based at least in part on acoustic measurements performed by other electronic devices. Thus, the A/V hub may communicate with the other electronic devices in the environment using wireless communication and may receive the acoustic measurements from the other electronic devices. In these embodiments, the one or more acoustic characteristics may be determined based on the locations of the other electronic devices in the environment. Note that the A/V hub may: receive the locations of the other electronic devices from the other electronic devices; access predetermined locations of the other electronic devices stored in memory; and/or determine the locations of the other electronic devices, e.g., based on the wireless communication.
In some embodiments, the a/V hub includes one or more acoustic wave sensors, and the a/V hub performs acoustic measurements using the one or more acoustic wave sensors. Thus, one or more acoustic characteristics may be determined by the a/V hub alone or in combination with acoustic measurements performed by other electronic devices.
However, in some embodiments, instead of determining the one or more acoustic characteristics, the a/V hub receives the determined one or more acoustic characteristics from one of the other electronic devices.
While acoustic characterization may be fully automated based on changing conditions, in some embodiments, a user may manually initiate a characterization mode upon detecting a changing condition, or may manually approve a characterization mode. For example, the A/V hub may: receiving a user input; and transitioning to the characterization mode based on the user input.
Fig. 12 is a diagram illustrating communication between a/V hub 112 and speakers 118-1. In particular, the interface circuitry 1210 in the A/V hub 112 may detect the speaker 118-1 through wireless communication of the frame or packet 1212 with the interface circuitry 1214 in the speaker 118-1. Note that this communication may be unidirectional or bidirectional.
The interface circuit 1210 may provide information 1216 to the processor 1218. This information may indicate the presence of speaker 118-1 in the environment. Alternatively or additionally, information 1216 may specify the location of speaker 118-1.
Processor 1218 may then determine whether a change condition has occurred 1220. For example, processor 1218 may determine that speaker 118-1 is present in the environment when speaker 118-1 was not previously present, or may determine that the location of previously detected speaker 118-1 has changed.
When it is determined to change condition 1220, processor 1218 may transition to characterization mode 1222. During characterization mode 1222, processor 1218 may provide instructions 1224 to interface circuit 1210. In response, the interface circuitry 1210 may send the instructions 1224 to the interface circuitry 1214 in frames or packets 1226.
Upon receiving the packet 1226, the interface circuitry 1214 may provide instructions 1224 to the processor 1228, which then the processor 1228 instructs the one or more acoustic sensors 1230 to play back the audio content 1232 at the specified playback time. Note that the processor 1228 may access the audio content 1232 in the memory 1208, or the audio content 1232 may be included in the packet 1226. Next, one or more acoustic sensors 1234 in a/V hub 112 may perform acoustic measurements 1236 of sounds corresponding to the audio content 1232 output by one or more acoustic sensors 1230. Based on the acoustic measurements 1236 (and/or additional acoustic measurements received by the interface circuitry 1210 from other speakers), the processor 1218 may determine one or more acoustic characteristics 1238 of the environment, which are then stored in the memory 1240.
Further, processor 1218 may provide playback timing information 1242 and audio content 1244 to interface circuit 1210, where playback timing information 1242 specifies a playback time when speaker 118-1 is to play back audio content 1244 based at least in part on the one or more acoustic characteristics 1238. In response, interface circuit 1210 may send one or more frames or packets 1246 including playback timing information 1242 and audio content 1244 to speaker 118-1. (However, in some embodiments, separate or distinct frames or packets are used to transmit playback timing information 1242 and audio content 1244.)
After the interface circuit 1214 receives one or more frames or packets 1246, it may provide playback timing information 1242 and audio content 1244 to the processor 1228. Processor 1228 may run software that performs playback operation 1248. For example, the processor 1228 may store the audio content 1244 in a queue in memory. In these embodiments, the playback operations 1248 include outputting audio content 1244 from the queue, which includes driving one or more acoustic wave sensors 1230 based on the audio content 1244 so that the speaker 118-1 outputs sound at the time specified by the playback timing information 1242.
In an exemplary embodiment, the communication technology is used to selectively determine one or more acoustic characteristics of the environment (e.g., room) that includes the a/V hub 112 when a change is detected. Fig. 13 presents a diagram illustrating selective acoustic characterization of an environment including a speaker 118. In particular, A/V hub 112 may detect speaker 118-1 in the environment. For example, a/V hub 112 may detect speaker 118-1 based on wireless communication of one or more frames or packets 1310 with speaker 118-1. Note that the wireless communication may be unidirectional or bidirectional.
When a change condition is determined (e.g., when the presence of speaker 118-1 is detected for the first time, i.e., speaker 118-1 was not previously detected in the environment, and/or when a change in the location 1312 of previously detected speaker 118-1 in the environment is detected), A/V hub 112 may transition to the characterization mode. For example, A/V hub 112 may transition to the characterization mode when the detected change in location 1312 of speaker 118-1 is on the order of, or larger than, the acoustic wavelength at the upper limit of human hearing (e.g., a change of 0.0085 m, 0.017 m, or 0.305 m, as non-limiting examples).
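The change-condition test itself could be as simple as the following sketch, in which a newly detected speaker or a displacement larger than a threshold triggers the characterization mode; the threshold is one of the non-limiting example values quoted above, and the names are illustrative:

```python
import math

LOCATION_THRESHOLD_M = 0.017   # assumed threshold, one of the quoted examples

def change_condition(known_locations, speaker_id, new_location):
    previous = known_locations.get(speaker_id)
    if previous is None:
        return True                                   # newly detected speaker
    return math.dist(previous, new_location) > LOCATION_THRESHOLD_M

known = {"speaker-118-1": (1.00, 2.00)}
print(change_condition(known, "speaker-118-1", (1.00, 2.05)))  # True: moved 5 cm
print(change_condition(known, "speaker-118-2", (3.00, 0.50)))  # True: new speaker
```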
During characterization mode, a/V hub 112 may: provide instructions to speaker 118-1 in frames or packets 1314 to play back audio content (i.e., output sound 1316) at the specified playback time; determining one or more acoustic characteristics of the environment based on an acoustic measurement of the sound 1316 output by the speaker 118-1; and storing one or more acoustic characteristics (which may include location 1312 of speaker 118-1) in memory.
For example, the audio content may include: a pseudo-random frequency pattern or white noise over a frequency range (e.g., between 100 Hz and 10,000 Hz, between 100 Hz and 20,000 Hz, or two or more sub-bands within the human audible range, such as 500 Hz, 1000 Hz, and 2000 Hz as non-limiting examples), an acoustic pattern having a carrier frequency that changes over time within the frequency range, an acoustic pattern having spectral content within the frequency range, and/or one or more types of music (e.g., harmonic music, classical music, chamber music, opera, rock or pop music, etc.). In some embodiments, the audio content uniquely identifies speaker 118-1, e.g., via a particular temporal pattern, spectral content, and/or one or more frequency tones. Alternatively or additionally, A/V hub 112 may receive an identifier, e.g., an alphanumeric code, for speaker 118-1 via wireless communication with speaker 118-1.
However, in some embodiments, acoustic characterization is performed without the speaker 118-1 playing the audio content. For example, the acoustic characterization may be based on acoustic energy associated with a person's voice or by measuring percussion background noise in the environment for 1 to 2 minutes. Thus, in some embodiments, the acoustic characterization includes a passive characterization (rather than an active measurement while the audio content is playing).
Further, the acoustic characteristics may include: the acoustic spectral response of the environment over a range of frequencies (i.e., information specifying the amplitude response as a function of frequency); an acoustic transfer function or impulse response over a range of frequencies (i.e., information specifying the amplitude and phase responses as a function of frequency); room resonances or low-frequency room modes (which have nodes and anti-nodes as a function of position or location in the environment, and which can be determined by measuring sound in the environment in different directions at 90° to each other); the location 1312 of speaker 118-1; reflections (including early reflections within 50 ms to 60 ms of the arrival of the direct sound from speaker 118-1, and late reflections or echoes that occur on longer time scales, which may affect clarity); the acoustic delay of the direct sound; the average reverberation time in a frequency range (or the persistence of sound in the frequency range after the audio content has stopped); the volume of the environment (e.g., the size and/or geometry of the room, which may be determined optically); background noise in the environment; ambient sounds in the environment; the temperature of the environment; the number of people in the environment (and, more generally, the absorption or acoustic losses in a frequency range in the environment); a measure of how acoustically lively, bright, or dead the environment is; and/or information specifying the type of environment (e.g., an auditorium, a general room, a concert hall, the size of the room, the type of furniture in the room, etc.). For example, the reverberation time may be defined as the time for the sound pressure associated with a pulse at a frequency to decay to a certain level (e.g., -60 dB). In some embodiments, the reverberation time is a function of frequency. Note that the frequency ranges in the foregoing examples of acoustic characteristics may be the same as or different from each other. Thus, in some embodiments, different frequency ranges may be used for different acoustic characteristics. Additionally, note that the "acoustic transfer function" in some embodiments may include the magnitude of the acoustic transfer function (which is sometimes referred to as the "acoustic spectral response"), the phase of the acoustic transfer function, or both.
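As one concrete example of these characteristics, the reverberation time defined above (decay to -60 dB) could be estimated from a measured decay envelope as in the following sketch; the synthetic envelope and sampling rate are assumptions for this sketch:

```python
import math

def reverberation_time(envelope, sample_rate_hz, decay_db=60.0):
    # Time for the envelope to fall decay_db below its peak, per the
    # definition of reverberation time given above.
    peak = max(envelope)
    threshold = peak * 10 ** (-decay_db / 20.0)
    peak_index = envelope.index(peak)
    for i in range(peak_index, len(envelope)):
        if envelope[i] <= threshold:
            return (i - peak_index) / sample_rate_hz
    return None   # the decay did not reach -60 dB within the measurement

# Synthetic exponential decay with a 145 ms time constant, sampled at 1 kHz.
fs = 1000
envelope = [math.exp(-n / fs / 0.145) for n in range(2000)]
print(reverberation_time(envelope, fs))   # approximately 1.0 second
```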
As previously described, the acoustic characteristic may include the location 1312 of the speaker 118-1. Location 1312 of speaker 118-1 (including distance and direction) may be determined by a/V hub 112 and/or in conjunction with other electronic devices in the environment (e.g., speaker 118) using the following techniques: triangulation, trilateration, time of flight, wireless ranging, angle of arrival, and the like. Further, location 1312 may be determined by a/V hub 112 using: wireless communication (e.g., communication with a wireless local area network or with a cellular telephone network), acoustic measurements, local positioning systems, global positioning systems, and the like.
Although the acoustic characteristics may be determined by A/V hub 112 based on measurements performed by A/V hub 112, in some embodiments the acoustic characteristics are determined by, or in conjunction with, other electronic devices in the environment. In particular, one or more other electronic devices (e.g., one or more other speakers 118) may perform the acoustic measurements and then wirelessly transmit the acoustic measurements in frames or packets 1318 to A/V hub 112. (Accordingly, the acoustic wave sensors that perform the acoustic measurements may be included in A/V hub 112 and/or in one or more other speakers 118.) Thus, A/V hub 112 may calculate the acoustic characteristics based, at least in part, on the acoustic measurements performed by A/V hub 112 and/or the one or more other speakers 118. Note that the calculation may also be based on the locations 1320 of the one or more other speakers 118 in the environment. These locations may be: received from the one or more other speakers 118 in frames or packets 1318; calculated using one of the techniques described previously (e.g., using wireless ranging); and/or accessed in memory (i.e., locations 1320 may be predetermined).
Further, while acoustic characterization may occur when a changing condition is detected, alternatively or additionally, a/V hub 112 may transition to a characterization mode based on user input. For example, the user may activate a virtual command icon in a user interface on the portable electronic device 110. Thus, acoustic characterization may be initiated automatically, manually, and/or semi-automatically (where the user interface is used to query the user for approval prior to transitioning to the characterization mode).
After determining the acoustic characteristics, the a/V hub 112 may transition back to the normal operating mode. In this mode of operation, the a/V hub 112 may send one or more frames or packets (e.g., packets 1322) including the additional audio content 1324 (e.g., music) and the playback timing information 1326 to the speaker 118-1, where the playback timing information 1326 may specify a playback time when the speaker 118-1 is to play back the additional audio content 1324 based on one or more acoustic characteristics. Accordingly, the acoustic characterization may be used to correct for or adapt to changes (directly or indirectly) in one or more acoustic characteristics associated with the change in the position 1312 of speaker 118-1, thereby improving the user experience.
Another approach for improving the acoustic experience is to adapt the coordination based on the dynamically tracked locations of one or more listeners. This is illustrated in fig. 14, which presents a flow chart illustrating a method 1400 for calculating an estimated location. Note that method 1400 may be performed by an A/V hub (e.g., A/V hub 112 (FIG. 1)). During operation, an a/V hub (e.g., control circuitry or control logic such as a processor running a program module in the a/V hub) may calculate an estimated location of a listener (alternatively, an electronic device associated with the listener, e.g., portable electronic device 110 in fig. 1) relative to electronic devices in an environment that includes the a/V hub and the electronic devices (operation 1410).
The A/V hub may then transmit one or more frames or packets including the audio content and playback timing information to the electronic device (operation 1412), where the playback timing information specifies a playback time when the electronic device is to play back the audio content based on the estimated location. Furthermore, the playback times of the electronic devices have a temporal relationship, thereby coordinating the playback of the audio content by the electronic devices. Note that the temporal relationship may have a non-zero value, such that instructing at least some of the electronic devices to use different values of the playback times causes the audio content to be played back with a phase relationship relative to one another. For example, the different playback times may be based on predetermined or dynamically determined acoustic characteristics of the environment that includes the electronic devices and the A/V hub. Alternatively or additionally, the different playback times may be based on desired acoustic characteristics in the environment. Additionally, the playback times may be based on the current time offsets between the clocks in the electronic devices and the clock in the A/V hub.
In some embodiments, the A/V hub optionally performs one or more additional operations (operation 1414). For example, the a/V hub may communicate with another electronic device and may calculate an estimated location of the listener based on the communication with the other electronic device.
Further, the a/V hub may include acoustic sensors that perform sound measurements in the environment, and may calculate an estimated location of the listener based on the sound measurements. Alternatively or additionally, the a/V hub may communicate with other electronic devices in the environment and may receive additional sound measurements of the environment from the other electronic devices and may calculate the estimated location of the listener based on the additional sound measurements.
In some embodiments, the a/V hub performs time-of-flight measurements and calculates an estimated location of the listener based on the time-of-flight measurements.
Further, the a/V hub may calculate additional estimated locations of additional listeners relative to the electronic devices in the environment, and the playback time may be based on the estimated locations and the additional estimated locations. For example, the playback time may be based on an average of the estimated position and the additional estimated position. Alternatively, the playback time may be based on a weighted average of the estimated position and the additional estimated position.
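The combination of several listener locations described above can be sketched as a (weighted) average; the coordinates and weights are example values for this sketch:

```python
def combined_location(locations, weights=None):
    # Unweighted or weighted average of estimated listener locations.
    if weights is None:
        weights = [1.0] * len(locations)
    total = sum(weights)
    return tuple(sum(w * p[i] for w, p in zip(weights, locations)) / total
                 for i in range(len(locations[0])))

listeners = [(2.0, 1.0), (3.0, 2.0)]                 # estimated locations (m)
print(combined_location(listeners))                  # simple average
print(combined_location(listeners, [2.0, 1.0]))      # weighted toward listener 1
```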
Fig. 15 is a diagram illustrating communication between portable electronic device 110, a/V hub 112, and speakers 118 (e.g., speaker 118-1). In particular, interface circuit 1510 in a/V hub 112 may receive one or more frames or packets 1512 from interface circuit 1514 in portable electronic device 110. Note that communication between the a/V hub 112 and the portable electronic device 110 may be unidirectional or bidirectional. Interface circuit 1510 and/or processor 1516 in a/V hub 112 may then estimate a location 1518 of a listener associated with portable electronic device 110 based on one or more frames or packets 1512. For example, interface circuit 1510 may provide information 1508 based on packet 1512, which information 1508 is used by processor 1516 to estimate location 1518.
Alternatively or additionally, one or more acoustic sensors 1520 in A/V hub 112 and/or one or more acoustic sensors 1506 in speakers 118 may perform measurements 1522 of sound associated with a listener. If speakers 118 perform measurements 1522-2 of the sound, interface circuits 1524 in one or more of speakers 118 (e.g., speaker 118-1) may send one or more frames or packets 1526 (with information 1528 specifying measurements 1522-2 of the sound) to interface circuit 1510 based on instructions 1530 from processor 1532. Interface circuit 1510 and/or processor 1516 may then estimate location 1518 based on the measured sound 1522.
Next, processor 1516 may instruct interface circuit 1510 to send one or more frames or packets 1536 having playback timing information 1538 and audio content 1540 to speaker 118-1, where playback timing information 1538 specifies a playback time when speaker 118-1 is to play back audio content 1540 based at least in part on location 1518. (however, in some embodiments, separate or distinct frames or packets are used to send playback timing information 1538 and audio content 1540.) note that processor 1516 may access audio content 1540 in memory 1534.
After receiving the one or more frames or packets 1536, the interface circuit 1524 may provide playback timing information 1538 and audio content 1540 to the processor 1532. The processor 1532 may run software that performs the playback operation 1542. For example, the processor 1532 may store the audio content 1540 in a queue in memory. In these embodiments, the playback operation 1542 includes outputting audio content 1540 from the queue, which includes driving one or more sound wave sensors 1506 based on the audio content 1540, so that speaker 118-1 outputs sound at a time specified by the playback timing information 1538.
In an exemplary embodiment, communication techniques are used to dynamically track the location of one or more listeners in an environment. Fig. 16 presents a diagram illustrating the calculation of the estimated position of one or more listeners relative to the speaker 118. In particular, the a/V hub 112 may calculate an estimated location of one or more listeners, for example, a location 1610 of a listener 1612 relative to a speaker 118 in an environment such as the environment that includes the a/V hub 112 and the speaker 118. For example, the location 1610 may be determined coarsely (e.g., to the nearest room, 3-to-10-meter accuracy, etc.) or finely (e.g., 0.1-to-3-meter accuracy), which is a non-limiting numerical example.
In general, the location 1610 may be determined by the a/V hub 112 and/or in conjunction with other electronic devices (e.g., speakers 118) in the environment using the following techniques: triangulation, trilateration, time of flight, wireless ranging, angle of arrival, and the like. Further, location 1610 may be determined by a/V hub 112 using: wireless communication (e.g., communication with a wireless local area network or with a cellular telephone network), acoustic measurements, local positioning systems, global positioning systems, and the like.
For example, the location 1610 of at least one listener 1612 may be estimated by the a/V hub 112 based on wireless communication (e.g., using wireless ranging, time-of-flight measurements, angle-of-arrival, RSSI, etc.) of one or more frames or packets 1614 with another electronic device (e.g., the portable electronic device 110, which may be beside or on their person) of the listener 1612. In some embodiments, wireless communications with other electronic devices (e.g., MAC addresses in frames or packets received from the portable electronic device 110) are used as signatures or electronic fingerprints to identify the listener 1612. Note that communication between the portable electronic device 110 and the a/V hub 112 may be unidirectional or bidirectional.
During wireless ranging, a/V hub 112 may transmit a frame or packet including a transmission time to, for example, portable electronic device 110. When the portable electronic device 110 receives the frame or packet, the time of arrival may be determined. Based on the product of the time of flight (difference in arrival time and transmission time) and the propagation velocity, the distance between the a/V hub 112 and the portable electronic device 110 may be calculated. This distance may then be transmitted along with the identifier of the portable electronic device 110 in a subsequent transfer of frames or packets from the portable electronic device 110 to the a/V hub 112. Alternatively, portable electronic device 110 may transmit a frame or packet that includes the time of transmission and an identifier of portable electronic device 110, and a/V hub 112 may determine the distance between portable electronic device 110 and a/V hub 112 based on the product of the time of flight (the difference in time of arrival and time of transmission) and the propagation velocity.
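In other words, with coordinated clocks the distance reduces to the one-way time of flight multiplied by the propagation speed of the wireless signal (approximately the speed of light). A minimal sketch, assuming the transmit and arrival times are already expressed on a common timebase:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def ranging_distance_m(transmit_time_s: float, arrival_time_s: float) -> float:
    """Distance from a one-way time of flight, assuming coordinated clocks.

    The time of flight is the arrival time minus the transmission time carried
    in the frame, and the distance is that time multiplied by the propagation
    speed (approximately the speed of light for radio signals).
    """
    time_of_flight_s = arrival_time_s - transmit_time_s
    if time_of_flight_s < 0:
        raise ValueError("arrival time precedes transmission time")
    return time_of_flight_s * SPEED_OF_LIGHT_M_PER_S

# Example: a 10 ns time of flight corresponds to roughly 3 m.
print(ranging_distance_m(0.0, 10e-9))  # ~2.998 m
```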
In a variation of this approach, a/V hub 112 may transmit a frame or packet 1614 reflected at portable electronic device 110, and the reflected frame or packet 1614 may be used to dynamically determine the distance between portable electronic device 110 and a/V hub 112.
While the foregoing examples illustrate wireless ranging with coordinated clocks in the portable electronic device 110 and the a/V hub 112, in other embodiments the clocks are not coordinated. For example, the location of the portable electronic device 110 may be estimated based on the propagation velocity and the times of arrival (sometimes referred to as "differential time of arrival") of wireless signals at several receivers at different known locations in the environment, even if the time of transmission is unknown or unavailable. For example, the receivers may be at least some of the other speakers 118 at locations 1616, and locations 1616 may be predetermined or dynamically determined. More generally, various radiolocation techniques may be used, such as: determining a distance based on a difference between the RSSI and the original transmitted signal strength (which may include corrections for absorption, refraction, shadowing, and/or reflection); determining an angle of arrival at a receiver (including non-line-of-sight reception) using a directional antenna or the differential arrival times at an antenna array having a known location in the environment; determining a distance based on a backscattered wireless signal; and/or determining the angles of arrival at two receivers having known locations in the environment (i.e., triangulation or multilateration). Note that the wireless signals may include transmissions over a GHz or multi-GHz bandwidth to produce pulses of short duration (e.g., approximately 1 ns), which may allow the distance to be determined to within approximately 0.3 m (e.g., 1 ft); these values are non-limiting examples. In some embodiments, wireless ranging is facilitated using location information (e.g., the location of one or more electronic devices in the environment determined or specified by a local positioning system, a global positioning system, and/or a wireless network, e.g., locations 1616).
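As one example of the RSSI-based technique listed above, a log-distance path-loss model converts the difference between the transmitted power and the RSSI into a distance estimate. The sketch below is illustrative only; the reference loss, reference distance, and path-loss exponent are assumed values that would normally be calibrated for the environment.

```python
def rssi_distance_m(tx_power_dbm: float,
                    rssi_dbm: float,
                    path_loss_exponent: float = 2.0,
                    reference_distance_m: float = 1.0,
                    reference_loss_db: float = 40.0) -> float:
    """Estimate distance from RSSI with a log-distance path-loss model.

    Model: loss(d) = reference_loss_db + 10 * n * log10(d / d0), so
    d = d0 * 10 ** ((tx_power - rssi - reference_loss_db) / (10 * n)).
    Corrections for absorption, shadowing, or reflection (mentioned above)
    would adjust the measured loss before this conversion.
    """
    measured_loss_db = tx_power_dbm - rssi_dbm
    exponent = (measured_loss_db - reference_loss_db) / (10.0 * path_loss_exponent)
    return reference_distance_m * (10.0 ** exponent)

# Example: 0 dBm transmitted, -70 dBm received, free-space-like exponent.
print(rssi_distance_m(0.0, -70.0))  # ~31.6 m under these assumed parameters
```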
Alternatively or additionally, a/V hub 112 may estimate location 1610 based on sound measurements in the environment (e.g., acoustic tracking of listener 1612), for example, based on sounds 1618 emitted by listener 1612 as they move, talk, and/or breathe. The acoustic measurements may be performed by the a/V hub 112 (e.g., using two or more acoustic wave sensors, such as microphones, which may be arranged as a phased array). However, in some embodiments, the sound measurements may be performed separately or additionally by one or more electronic devices in the environment (e.g., speakers 118), and these sound measurements may be wirelessly transmitted in frames or packets 1618 to the a/V hub 112, which in turn uses the sound measurements to estimate the location 1610. In some embodiments, the listener 1612 is identified using voice-recognition techniques.
In some embodiments, the location 1610 is estimated by the a/V hub 112 based on sound measurements in the environment and predetermined acoustic characteristics of the environment (e.g., spectral response or acoustic transfer function). For example, as the listener 1612 moves through the environment, changes in the excitation of the predetermined room pattern can be used to estimate the location 1610.
Further, the location 1610 of the listener 1612 can be tracked or estimated using one or more other techniques. For example, the location 1610 may be estimated (and, thus, coarsely determined) based on optical imaging of the listener 1612 in a wavelength band (e.g., visible or infrared), time-of-flight measurements (e.g., laser ranging), and/or a grid of beams (e.g., infrared beams) that locates the listener 1612 within the grid based on the pattern of beam crossings. In some embodiments, the identity of the listener 1612 is determined from an optical image using facial-recognition and/or gait-recognition techniques.
For example, in some embodiments, the listener's location in the environment is tracked based on wireless communication with a cellular telephone carried by the listener. Based on the pattern of locations in the environment, the location of furniture in the environment and/or the geometry of the environment (e.g., the size or dimensions of a room) may be determined. This information can be used to determine the acoustic properties of the environment. Further, the listener's historical locations can be used to constrain the estimated location of the listener in the environment. In particular, historical information about the listener's location in the environment at different times of day may be used to help estimate the current location of the listener at a particular time of day. Thus, in general, the location of the listener may be estimated using a combination of optical measurements, acoustic characteristics, wireless communication, and/or machine learning.
After determining the location 1610, the a/V hub 112 may transmit to the speaker 118 at least one or more frames or packets that include additional audio content 1622 (e.g., music) and playback timing information (e.g., the playback timing information 1624-1 in the packet 1620-1 is transmitted to the speaker 118-1), where the playback timing information 1624-1 may specify a playback time at which the speaker 118-1 is to play back the additional audio content 1622 based on the location 1610. Thus, communication techniques may be used to correct or adapt for changes in the location 1610, thereby improving the user experience.
As previously mentioned, the different playback times may be based on desired acoustic characteristics in the environment. For example, the desired acoustic characteristics may include a type of playback, such as: mono sound, stereo sound, and/or multi-channel sound. Mono sound may comprise one or more audio signals that do not contain amplitude (or level) and arrival-time/phase information that replicates or simulates directional cues (i.e., directional sound).
Further, stereo sound may include two independent audio-signal channels, and the audio signals may have a particular amplitude and phase relationship to each other such that, during playback operations, there is a distinct image of the original sound source. In general, the two channels of audio signals may provide coverage for most or all of the environment. By adjusting the relative amplitudes and/or phases of the audio channels, the optimal listening point (or sweet spot) can be moved to follow the determined position of at least one listener. However, the amplitude differences and the arrival-time differences (the directional cues) may need to be small enough that both the stereo image and the localization are maintained. Otherwise, the image may collapse and only one or the other audio channel may be heard.
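The following sketch illustrates one way the relative amplitudes and delays of the two channels might be trimmed so that both channels arrive at the listener at roughly the same time and level, keeping the directional cues small as required above. The geometry helper, the 1/r level match, and the speed-of-sound constant are simplifying assumptions, not the described embodiments' actual processing.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # approximate, at room temperature

def channel_trims(listener_xy, left_xy, right_xy):
    """Per-channel delay (s) and gain that roughly re-center the sweet spot.

    The nearer speaker is delayed so both channels arrive at the listener at
    the same time, and its level is reduced (simple 1/r match) so the
    amplitude difference at the listener stays small.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    d_left, d_right = dist(listener_xy, left_xy), dist(listener_xy, right_xy)
    d_max = max(d_left, d_right)
    trims = {}
    for name, d in (("left", d_left), ("right", d_right)):
        trims[name] = {
            "delay_s": (d_max - d) / SPEED_OF_SOUND_M_PER_S,  # delay the nearer speaker
            "gain": d / d_max,  # attenuate the nearer speaker
        }
    return trims

# Example: listener sits closer to the right speaker.
print(channel_trims((1.0, 2.0), (-1.0, 0.0), (1.0, 0.0)))
```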
Note that the audio channels in stereo sound may need to have the correct absolute phase response. This means that an audio signal having a positive pressure waveform at the input of the system may need to have the same positive pressure waveform at the output from one of the speakers 118. Thus, a drum that produces a positive pressure waveform at the microphone when struck may need to produce a positive pressure waveform in the environment. Conversely, if the absolute polarity is incorrectly reversed, the audio image may be unstable. In particular, the listener may not find or perceive a stable audio image. Instead, the audio image may drift (or wander) and may appear to be localized at the speakers 118.
Further, multi-channel sound may include a left audio channel, a center audio channel, and a right audio channel. For example, these channels may allow dialog or monophonic speech to be enhanced, and may allow music or sound-effect cues to be localized or mixed with stereo-like imaging for a particular viewing angle. Thus, the three audio channels may provide coverage of most or all of the environment while preserving amplitude and directional cues, which may not be possible with mono or stereo sound alone.
Alternatively or additionally, the desired acoustic characteristics may include an acoustic radiation pattern. The desired acoustic radiation pattern may be a function of the reverberation time in the environment. For example, the reverberation time may vary depending on the number of people in the environment, the type and amount of furniture in the environment, whether curtains are open or closed, whether windows are open or closed, and the like. When the reverberation time is longer or increases, the desired acoustic radiation pattern may be more directional, so that the sound is steered or directed toward the listener (which reduces the perceived reverberation). In some embodiments, the desired acoustic characteristics include word intelligibility.
While the foregoing discussion describes techniques that may be used to dynamically track the location 1610 of the listener 1612 (or the portable electronic device 110), these techniques may also be used to determine the location of an electronic device (e.g., the speaker 118-1) in the environment.
Another approach for improving the acoustic experience is to dynamically aggregate the electronic devices into groups and/or to adapt the coordination based on the groups. This is illustrated in fig. 17, which presents a flow chart illustrating a method 1700 for aggregating electronic devices. Note that method 1700 may be performed by an A/V hub, such as A/V hub 112 (FIG. 1). During operation, the a/V hub (e.g., control circuitry or control logic in the a/V hub, such as a processor executing a program module) may measure the sound output by electronic devices (e.g., speakers 118) in the environment using one or more acoustic wave sensors (operation 1710), where the sound corresponds to audio content. For example, the measured sound may include sound pressure.
The a/V hub may then aggregate the electronic devices into two or more subsets based on the measured sound (operation 1712). Note that different subsets may be located in different rooms in the environment. Further, at least one subset may play back audio content that is different from that of the remaining subsets. Further, the electronic devices may be aggregated into two or more subsets based on: the different audio content; acoustic delays of the measured sound; and/or desired acoustic characteristics in the environment. In some embodiments, the electronic devices in the subsets and/or the geographic locations or areas associated with the subsets are not predetermined. Instead, the a/V hub may dynamically aggregate the subsets.
Further, the a/V hub may determine playback timing information for the subset (operation 1714), where the playback timing information specifies a playback time when the electronic devices in the given subset are to play back the audio content.
Next, the a/V hub may transmit one or more frames or packets including the audio content and the playback timing information to the electronic devices using wireless communication (operation 1716), wherein at least the playback times of the electronic devices of the given subset have a temporal relationship to coordinate the playback of the audio content by the electronic devices of the given subset.
In some embodiments, the A/V hub optionally performs one or more additional operations (operation 1718). For example, the a/V hub may calculate an estimated location of at least one listener relative to the electronic devices, and may aggregate the electronic devices into two or more subsets based on at least the estimated location of the listener. This may help ensure that the listener has an improved acoustic experience while reducing acoustic crosstalk from other subsets.
Further, the A/V hub may modify the measured sound based on a predetermined (or dynamically determined) acoustic transfer function of the environment in at least one frequency band (e.g., 100 Hz-20,000 Hz, which is a non-limiting example). This may allow the a/V hub to determine the original output sound without the spectral filtering or distortion associated with the environment, which may allow the a/V hub to make better decisions when aggregating the electronic devices into subsets.
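One common way to remove the environment's filtering from a measured signal, given a (predetermined) room impulse response, is regularized frequency-domain division. The sketch below is a rough illustration under that assumption; the regularization constant and the toy two-tap "room" in the example are arbitrary.

```python
import numpy as np

def remove_room_filtering(measured: np.ndarray,
                          room_impulse_response: np.ndarray,
                          regularization: float = 1e-3) -> np.ndarray:
    """Roughly undo the environment's filtering of a measured signal.

    Divides the measured spectrum by the room's transfer function (the FFT of
    its impulse response), with a small regularization term so that near-zero
    bins do not blow up.
    """
    n = len(measured) + len(room_impulse_response) - 1
    measured_f = np.fft.rfft(measured, n)
    room_f = np.fft.rfft(room_impulse_response, n)
    corrected_f = measured_f * np.conj(room_f) / (np.abs(room_f) ** 2 + regularization)
    return np.fft.irfft(corrected_f, n)[: len(measured)]

# Example: a 440 Hz burst filtered by a toy two-tap "room".
room = np.array([1.0, 0.4])
clean = np.sin(2 * np.pi * 440 * np.arange(0, 0.01, 1 / 48_000))
measured = np.convolve(clean, room)
recovered = remove_room_filtering(measured, room)[: len(clean)]
print(np.allclose(recovered, clean, atol=0.01))  # True
```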
Further, the a/V hub may determine a playback volume for the subsets for use in playing back the audio content, and the one or more frames or packets may include information specifying the playback volumes. For example, the playback volume of at least one subset may be different from the playback volume of the remaining subsets. Alternatively or additionally, the playback volumes may reduce acoustic crosstalk between two or more of the subsets, such that a listener is more likely to hear the sound output by the subset to which they are closest.
Fig. 18 is a diagram illustrating communications between the portable electronic device 110, the a/V hub 112, and the speakers 118. In particular, the processor 1810 may provide instructions 1812 to one or more acoustic wave sensors 1814 in the a/V hub 112 to perform measurements 1816 of sound associated with the speakers 118. The processor 1810 may then group the speakers 118 into two or more subsets 1818 based on the measurements 1816.
Further, the processor 1810 can determine playback timing information 1820 for the subset 1818, wherein the playback timing information 1820 specifies a playback time at which the speakers 118 in the given subset will play back the audio content 1822. Note that the processor 1810 can access audio content 1822 in the memory 1824.
Next, the processor 1810 can instruct the interface circuit 1826 to send a frame or packet 1828 with the playback timing information 1820 and the audio content 1822 to the speaker 118. (however, in some embodiments, separate or distinct frames or packets are used to transmit playback timing information 1820 and audio content 1822.)
After receiving the one or more frames or packets 1828, interface circuitry in the speaker 118-3 may provide the playback timing information 1820 and the audio content 1822 to the processor. The processor may run software that performs playback operation 1830. For example, the processor may store the audio content 1822 in a queue in memory. In these embodiments, playback operation 1830 includes outputting the audio content 1822 from the queue, which includes driving one or more acoustic transducers based on the audio content 1822 such that the speaker 118-3 outputs sound at the time specified by the playback timing information 1820. Note that at least the playback times of the speakers 118 in a given subset have a temporal relationship in order to coordinate the playback of the audio content 1822 by the speakers 118 in the given subset.
In an exemplary embodiment, communication techniques are used to aggregate speakers 118 into subsets. Fig. 19 presents a diagram showing the aggregation of speakers 118 in an environment, which may be in the same room or in different rooms. The a/V hub 112 may measure the sound 1910 output by the speakers 118. Based on these measurements, the a/V hub 112 may aggregate the speakers 118 into subsets 1912. For example, the subsets 1912 may be aggregated based on sound intensity and/or acoustic delay, such that neighboring speakers are aggregated together. In particular, the speakers with the highest sound intensities and/or similar acoustic delays may be grouped together. To facilitate aggregation, the speakers 118 may wirelessly transmit identifying information and/or acoustically output acoustic-characterization patterns outside the range of human hearing. For example, an acoustic-characterization pattern may include a pulse. However, a wide variety of time, frequency and/or modulation techniques may be used, including: amplitude modulation, frequency modulation, phase modulation, and the like. Alternatively or additionally, the a/V hub 112 may instruct each speaker 118, one at a time, to dither the playback time or phase of its output sound so that the a/V hub 112 can associate the measured sound with a particular speaker. In addition, the measured sound 1910 may be corrected using the acoustic transfer function of the environment to cancel the effects of reflections and filtering (or distortion) before the speakers 118 are aggregated. In some embodiments, the speakers 118 are aggregated based at least in part on the locations 1914 of the speakers 118, which may be determined using one or more of the aforementioned techniques (e.g., using wireless ranging). In this manner, the subsets 1912 may be dynamically modified as one or more listeners reposition the speakers 118 in the environment.
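A minimal sketch of delay-based aggregation is shown below: speakers whose measured acoustic delays fall within a gap threshold of each other land in the same subset. The dictionary of per-speaker delays, the single-pass clustering, and the 5 ms threshold are illustrative assumptions; the described embodiments may also use sound intensity, location, or desired acoustic characteristics.

```python
from typing import Dict, List

def aggregate_by_delay(speaker_delays_ms: Dict[str, float],
                       max_gap_ms: float = 5.0) -> List[List[str]]:
    """Group speakers whose measured acoustic delays are similar.

    Speakers are sorted by delay, and a new subset is started whenever the gap
    to the previous speaker exceeds max_gap_ms; speakers close to the hub (and
    to each other) therefore tend to land in the same subset.
    """
    ordered = sorted(speaker_delays_ms.items(), key=lambda kv: kv[1])
    subsets: List[List[str]] = []
    previous_delay = None
    for name, delay in ordered:
        if previous_delay is None or delay - previous_delay > max_gap_ms:
            subsets.append([])  # start a new subset
        subsets[-1].append(name)
        previous_delay = delay
    return subsets

# Example: two speakers near the hub, one in another room.
print(aggregate_by_delay({"118-1": 2.1, "118-2": 3.0, "118-3": 14.5}))
# -> [['118-1', '118-2'], ['118-3']]
```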
The a/V hub 112 may then transmit one or more frames or packets (e.g., packet 1916) including the additional audio content 1918 (e.g., music) and playback timing information 1920 to the speakers 118 in at least one of the subsets 1912 (e.g., subset 1912-1), where the playback timing information 1920 may specify a playback time when the speakers 118 in the subset 1912-1 are to play back the additional audio content 1918. Thus, communication techniques may be used to dynamically select the subset 1912 based on, for example, the location of the listener and/or desired acoustic characteristics in the environment including the a/V hub 112 and speakers 118.
Another approach for improving the acoustic experience is to dynamically equalize the audio based on acoustic monitoring in the environment. Fig. 20 presents a flowchart illustrating a method 2000 for determining equalized audio content, which may be performed by an a/V hub, such as a/V hub 112 (fig. 1). During operation, the a/V hub (e.g., control circuitry or control logic in the a/V hub, such as a processor executing a program module) may measure the sound output by electronic devices in the environment (e.g., speakers 118) using one or more acoustic wave sensors (operation 2010), where the sound corresponds to audio content. For example, the measured sound may include sound pressure.
The A/V hub may then compare the measured sound to a desired acoustic characteristic at a first location in the environment based on the first location in the environment, a second location of the A/V hub, and a predetermined or dynamically determined acoustic transfer function of the environment in at least one frequency band (e.g., 100 Hz-20,000 Hz, as a non-limiting example) (operation 2012). Note that the comparison may be performed in the time domain and/or the frequency domain. To perform the comparison, the a/V hub may calculate acoustic characteristics (e.g., acoustic transfer functions or modal responses) at the first location and/or the second location, and may use the calculated acoustic characteristics to correct the measured sound for filtering or distortion in the environment. Using acoustic transfer functions as an example, the calculation may involve using Green's function techniques to calculate the acoustic response of the environment as a function of location, where one or more point or distributed sound sources are located at predefined or known locations in the environment. Note that the acoustic transfer function and the correction at the first location may depend on the integrated acoustic behavior of the environment (and, thus, on the second location and/or the locations of the sound sources (e.g., speakers 118) in the environment). Thus, the acoustic transfer function may include information that specifies the location in the environment at which the acoustic transfer function was determined (e.g., the second location) and/or the locations of the sound sources in the environment (e.g., the location of at least one electronic device).
Further, the a/V hub may determine equalized audio content based on the comparison and the audio content (operation 2014). Note that the desired acoustic characteristics may be based on the type of audio playback, for example: mono, stereo and/or multi-channel sound. Alternatively or additionally, the desired acoustic characteristics may include an acoustic radiation pattern. The desired acoustic radiation pattern may be a function of the reverberation time in the environment. For example, the reverberation time may vary depending on the number of people in the environment, the type and amount of furniture in the environment, whether curtains are open or closed, whether windows are open or closed, and the like. When the reverberation time is longer or increases, the desired acoustic radiation pattern may be more directional, such that the sound associated with the equalized audio content is directed or conveyed toward the listener (thereby reducing reverberation). Thus, in some embodiments, the equalization is a complex function that modifies the amplitude and/or phase of the audio content. Further, the desired acoustic characteristics may include reducing room resonances or room modes by reducing the energy at the relevant low frequencies in the audio content. Note that in some embodiments the desired acoustic characteristics include word intelligibility. Thus, the target (the desired acoustic characteristics) may be used to adjust the equalization of the audio content.
Next, the a/V hub may transmit one or more frames or packets including the equalized audio content to the electronic device using wireless communication (operation 2016) to facilitate the electronic device to output additional sound corresponding to the equalized audio content.
In some embodiments, the a/V hub optionally performs one or more additional operations (operation 2018). For example, the first location may include an estimated location of the listener relative to the electronic devices, and the a/V hub may calculate the estimated location of the listener. In particular, the estimated location of the listener may be determined dynamically using one or more of the aforementioned techniques. Thus, the a/V hub may calculate the estimated location of the listener based on the sound measurements. Alternatively or additionally, the a/V hub may communicate with another electronic device and may calculate the estimated location of the listener based on the communication with the other electronic device. In some embodiments, the communication with the other electronic device includes wireless ranging, and the estimated location may be calculated based on the wireless ranging and the angle of arrival of wireless signals from the other electronic device. Further, the a/V hub may perform time-of-flight measurements and may calculate the estimated location of the listener based on the time-of-flight measurements. In some embodiments, the dynamic equalization allows the "best listening point" in the environment to be adjusted based on the location of the listener. Note that the a/V hub may determine the number of listeners and/or the locations of the listeners in the environment, and the dynamic equalization may adjust the sound so that the listener (or most of the listeners) experiences the desired acoustic characteristics when listening to the equalized audio content.
Further, the a/V hub may communicate with other electronic devices in the environment and may receive additional sound measurements of the environment (either independent of or in combination with the sound measurements) from the other electronic devices. The a/V hub may then perform one or more additional comparisons of the additional sound measurements with the desired acoustic characteristics at the first location in the environment based on one or more third locations of the other electronic devices (e.g., locations of speakers 118) and the predetermined or dynamically determined acoustic transfer function of the environment in the at least one frequency band, and further determine equalized audio content based on the one or more additional comparisons. In some embodiments, the a/V hub determines one or more third locations based on communications with other electronic devices. For example, the communication with the other electronic devices may include wireless ranging, and the one or more third locations may be calculated based on the wireless ranging and the angle of arrival of wireless signals from the other electronic devices. Alternatively or additionally, the a/V hub may receive information from the other electronic device specifying the third location. Accordingly, the location of other electronic devices may be determined using one or more of the above-described techniques for determining the location of electronic devices in an environment.
Further, the a/V hub may determine playback timing information that specifies a playback time when the electronic device plays back the equalized audio content, and the one or more frames or packets may include the playback timing information. In these embodiments, the playback times of the electronic devices have a temporal relationship to coordinate the playback of the audio content by the electronic devices.
Fig. 21 is a diagram illustrating communications between the portable electronic device 110, the a/V hub 112, and the speakers 118. In particular, the processor 2110 may provide instructions 2112 to one or more acoustic sensors 2114 in the a/V hub 112 to measure sound 2116 associated with the speakers 118 and corresponding to the audio content 2118. The processor 2110 may then compare 2120 the measured sound 2116 with a desired acoustic characteristic 2122 at a first location in the environment based on the first location in the environment, the second location of the a/V hub 112, and a predetermined or dynamically determined acoustic transfer function 2124 of the environment in at least one frequency band (which may be accessed in memory 2128).
Further, the processor 2110 can determine equalized audio content 2126 based on the comparison 2120 and the audio content 2118, which may be accessed in the memory 2128. Note that the processor 2110 may know in advance that the speakers 118 are outputting the audio content 2118.
Next, the processor 2110 can determine playback timing information 2130, where the playback timing information 2130 specifies a playback time when the speaker 118 is to play back the equalized audio content 2126.
Further, the processor 2110 can instruct the interface circuit 2132 to send one or more frames or packets 2134 with playback timing information 2130 and equalized audio content 2126 to the speaker 118. (however, in some embodiments, playback timing information 2130 and audio content 2126 are sent using separate or distinct frames or packets.)
After receiving the one or more frames or packets 2134, interface circuitry in one of the speakers 118 (e.g., speaker 118-1) may provide the playback timing information 2130 and the equalized audio content 2126 to the processor. The processor may run software that performs a playback operation. For example, the processor may store the equalized audio content 2126 in a queue in memory. In these embodiments, the playback operation includes outputting the equalized audio content 2126 from the queue, which includes driving one or more acoustic transducers based on the equalized audio content 2126 such that speaker 118-1 outputs sound at the time specified by the playback timing information 2130. The playback times of the speakers 118 have a temporal relationship to coordinate the playback of the equalized audio content 2126 by the speakers 118.
In an exemplary embodiment, communication techniques are used to dynamically equalize audio content. Fig. 22 presents a diagram illustrating the determination of equalized audio content using speakers 118. In particular, a/V hub 112 may measure sound 2210 corresponding to the audio content output by speakers 118. Alternatively or additionally, the portable electronic device 110 and/or at least some of the speakers 118 may measure the sound 2210 and may provide information specifying the measurement to the a/V hub 112 in frames or packets 2212.
The a/V hub 112 may then compare the measured sound 2210 with desired acoustic characteristics at a location 2214 in the environment (e.g., a dynamic location of one or more listeners, which may also be the location of the portable electronic device 110) based on the location 2214, the location 2216 of the a/V hub 112, the locations 2218 of the speakers 118, and/or a predetermined or dynamically determined acoustic transfer function (or, more generally, acoustic characteristics) of the environment in the at least one frequency band. For example, the a/V hub 112 may calculate an acoustic transfer function at locations 2214, 2216, and/or 2218. As previously described, this calculation may involve using Green's function techniques to calculate the acoustic response at locations 2214, 2216, and/or 2218. Alternatively or additionally, the calculation may involve interpolation (e.g., minimum-bandwidth interpolation) of predetermined acoustic transfer functions at different locations in the environment (i.e., at locations 2214, 2216, and/or 2218). The a/V hub 112 may then correct the measured sound 2210 based on the calculated and/or interpolated acoustic transfer function (or, more generally, acoustic characteristics).
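As a simple stand-in for the interpolation step, the sketch below estimates the transfer function at an arbitrary location by inverse-distance weighting of transfer functions measured at a few known locations. Inverse-distance weighting is used here only for illustration (the text mentions, e.g., minimum-bandwidth interpolation), and the array shapes and function name are assumptions.

```python
import numpy as np

def interpolate_transfer_function(target_xy: np.ndarray,
                                  known_xy: np.ndarray,
                                  known_tf: np.ndarray,
                                  eps: float = 1e-6) -> np.ndarray:
    """Estimate the transfer function at target_xy by inverse-distance
    weighting of transfer functions measured at known_xy.

    known_xy has shape (m, 2); known_tf has shape (m, n_bins) and holds the
    complex frequency response measured at each of the m locations.
    """
    distances = np.linalg.norm(known_xy - target_xy, axis=1)
    weights = 1.0 / (distances + eps)
    weights /= weights.sum()
    return (weights[:, None] * known_tf).sum(axis=0)

# Example: two measured responses; the target sits halfway between them.
known_xy = np.array([[0.0, 0.0], [2.0, 0.0]])
known_tf = np.array([[1.0 + 0j, 0.5 + 0j], [0.8 + 0j, 0.7 + 0j]])
print(interpolate_transfer_function(np.array([1.0, 0.0]), known_xy, known_tf))
# -> [0.9+0.j 0.6+0.j]
```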
In this way, communication techniques may be used to compensate for sparse sampling when initially determining the acoustic transfer function.
Further, a/V hub 112 may determine equalized audio content based on the comparison and the audio content. For example, the A/V hub 112 may modify the spectral content and/or phase of the audio content over a frequency range (e.g., 100 Hz to 10,000 or 20,000 Hz) to achieve the desired acoustic characteristics.
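A minimal per-band sketch of that spectral modification: compute the correction, in dB, that moves each measured band level toward the target level, and clamp it so the equalization does not overdrive the speakers. The band centers, the flat target, and the boost/cut limits are illustrative assumptions.

```python
import numpy as np

def band_correction_gains(measured_db: np.ndarray,
                          target_db: np.ndarray,
                          max_boost_db: float = 6.0,
                          max_cut_db: float = 12.0) -> np.ndarray:
    """Per-band gains (dB) that move the measured band levels toward the target.

    Positive values boost a band and negative values cut it; the result is
    clamped so the equalization does not overdrive the speakers.
    """
    correction_db = target_db - measured_db
    return np.clip(correction_db, -max_cut_db, max_boost_db)

# Example: the room builds up low frequencies (room modes) and rolls off highs.
bands_hz = np.array([100, 315, 1000, 3150, 10000])
measured = np.array([+6.0, +2.0, 0.0, -1.0, -4.0])
target = np.zeros_like(measured)  # flat target at the listening position
print(dict(zip(bands_hz.tolist(), band_correction_gains(measured, target).tolist())))
# -> {100: -6.0, 315: -2.0, 1000: 0.0, 3150: 1.0, 10000: 4.0}
```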
Next, a/V hub 112 may transmit to the speakers 118 one or more frames or packets that include the equalized audio content (e.g., music) and playback timing information (e.g., packet 2220 with the equalized audio content and playback timing information 2224), where the playback timing information may specify a playback time when the speakers 118 are to play back the equalized audio content.
In this manner, the communication techniques may allow the sound output by the speakers 118 to adapt to variations in the location 2214 of one or more listeners (e.g., an average or median location, a location corresponding to a majority of the listeners, an average location of the largest subset of listeners for which the desired acoustic characteristics can be achieved for a given audio content and environment, acoustic transfer function or acoustic characteristics, etc.). In particular, this may allow the optimal listening point for stereo playback to track the motion of one or more listeners in the environment and/or variations in the number of listeners (which may be determined by the a/V hub 112 using one or more of the techniques described above). Alternatively or additionally, the communication techniques may allow the sound output by the speakers 118 to adapt to changes in the audio content and/or the desired acoustic characteristics. For example, depending on the type of audio content (e.g., the type of music), one or more listeners may want or desire a large or wide sound (with divergent sound waves corresponding to an apparently physically spread-out sound source) or an apparently narrow or point-like source. Accordingly, the communication techniques may allow the audio content to be equalized according to the desired psychoacoustic experience of one or more listeners. Note that the desired acoustic characteristics or desired psychoacoustic experience may be explicitly specified by one or more listeners (e.g., by using a user interface on the portable electronic device 110), or may be indirectly determined or inferred without user action (e.g., based on the type of music or previous acoustic preferences of one or more listeners stored in a listening history).
In some embodiments of method 200 (fig. 2), method 500 (fig. 5), method 800 (fig. 8), method 1100 (fig. 11), method 1400 (fig. 14), method 1700 (fig. 17), and/or method 2000 (fig. 20), there are more or fewer operations. In addition, the order of the operations may be changed, and/or two or more operations may be combined into a single operation. Further, one or more operations may be modified.
We now describe embodiments of an electronic device. FIG. 23 presents a block diagram illustrating an electronic device 2300, such as one of the portable electronic devices 110, the A/V hub 112, the A/V display device 114, the receiver device 116, or the speaker 118 of FIG. 1. The electronic device includes a processing subsystem 2310, a memory subsystem 2312, a network subsystem 2314, an optional feedback subsystem 2334, and an optional monitoring subsystem 2336. The processing subsystem 2310 includes one or more devices configured to perform computing operations. For example, the processing subsystem 2310 may include one or more microprocessors, Application Specific Integrated Circuits (ASICs), microcontrollers, programmable logic devices, and/or one or more Digital Signal Processors (DSPs). One or more of these components in the processing subsystem are sometimes referred to as "control circuitry". In some embodiments, the processing subsystem 2310 includes a "control mechanism" or a "processing means" that performs at least some of the operations in the communication techniques.
The memory subsystem 2312 includes one or more means for storing data and/or instructions for the processing subsystem 2310 and the network subsystem 2314. For example, the memory subsystem 2312 may include Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), and/or other types of memory. In some embodiments, instructions in memory subsystem 2312 for processing subsystem 2310 include: one or more program modules or sets of instructions (e.g., program modules 2322 or operating system 2324) that are executable by processing subsystem 2310. Note that one or more computer programs or program modules may constitute a computer program mechanism. Further, instructions in various modules in memory subsystem 2312 may be implemented as: a high-level programming language, an object-oriented programming language, and/or an assembly or machine language. Further, the programming language may be compiled or interpreted, e.g., configurable or configured (used interchangeably in this discussion), to be executed by the processing subsystem 2310.
Further, the memory subsystem 2312 may include mechanisms for controlling access to memory. In some embodiments, the memory subsystem 2312 includes a memory hierarchy including one or more caches coupled to memory in the electronic device 2300. In some of these embodiments, one or more caches are located in processing subsystem 2310.
In some embodiments, memory subsystem 2312 is coupled to one or more high capacity mass storage devices (not shown). For example, the memory subsystem 2312 may be coupled to a magnetic or optical drive, a solid state drive, or other type of mass storage device. In these embodiments, memory subsystem 2312 may be used by electronic device 2300 as fast-access memory for frequently used data, while mass storage devices are used to store less frequently used data.
The network subsystem 2314 includes one or more devices configured to couple to and communicate (i.e., perform network operations) over a wired and/or wireless network, including: control logic 2316, interface circuitry 2318, and an associated antenna 2320. (FIG. 23 includes antenna 2320. In some embodiments, the electronic device 2300 includes one or more nodes, e.g., node 2308 (such as a pad), which may be coupled to the antenna 2320. Thus, the electronic device 2300 may or may not include the antenna 2320.) For example, the network subsystem 2314 may include a Bluetooth network system, a cellular network system (e.g., a 3G/4G network such as UMTS, LTE, etc.), a Universal Serial Bus (USB) network system, a network system based on the standards described in IEEE 802.11 (e.g., a Wi-Fi network system), an Ethernet network system, and/or other network systems. Note that a given combination of one interface circuit 2318 and at least one antenna 2320 may constitute a radio. In some embodiments, the network subsystem 2314 includes a wired interface, such as an HDMI interface 2330.
The network subsystems 2314 include processors, controllers, radios/antennas, jacks/plugs, and/or other devices for coupling to, communicating on, and processing data and events with each supported network system. Note that the mechanisms for coupling to, communicating on, and processing data and events on the network are sometimes collectively referred to as the "network interfaces" of the network system. Furthermore, in some embodiments, a "network" between electronic devices does not yet exist. Thus, electronic device 2300 may use mechanisms in network subsystem 2314 to perform simple wireless communications between electronic devices, such as transmitting advertisement or beacon frames or packets and/or scanning for advertisement frames or packets transmitted by other electronic devices, as previously described.
Within electronic device 2300, a processing subsystem 2310, a memory subsystem 2312, a network subsystem 2314, an optional feedback subsystem 2334, and an optional monitoring subsystem 2336 are coupled together using a bus 2328. Bus 2328 may include electrical, optical, and/or electro-optical connections such that the subsystems may be used to transfer commands and data between each other. Although only one bus 2328 is shown for clarity, different embodiments may include different numbers or configurations of electrical, optical, and/or electro-optical connections between subsystems.
In some embodiments, electronic device 2300 includes a display subsystem 2326 for displaying information (e.g., a request to clarify the identified environment) on a display, which may include a display driver, an I/O controller, and a display. Note that a wide variety of display types may be used in display subsystem 2326, including: two-dimensional displays, three-dimensional displays (e.g., holographic displays or volumetric displays), head-mounted displays, retina-image projectors, head-up displays, cathode ray tubes, liquid crystal displays, projection displays, electroluminescent displays, electronic paper-based displays, thin film transistor displays, high performance addressing displays, organic light emitting diode displays, surface conduction electron emitter displays, laser displays, carbon nanotube displays, quantum dot displays, interferometric modulator displays, multi-touch screens (also sometimes referred to as touch-sensitive displays), and/or displays based on other types of display technologies or physical phenomena.
In addition, the optional feedback subsystem 2334 may include one or more sensory-feedback mechanisms or devices, such as: a vibration mechanism or vibration actuator (e.g., an eccentric-rotating-mass actuator or a linear resonant actuator), a light, one or more speakers, etc., which may be used to provide feedback (e.g., sensory feedback) to a user of the electronic device 2300. Alternatively or additionally, the optional feedback subsystem 2334 may be used to provide sensory input to the user. For example, the one or more speakers may output sound, such as audio. Note that the one or more speakers may include an array of transducers (e.g., a phased array of acoustic transducers) whose drive may be modified to adjust the characteristics of the sound output by the one or more speakers. This capability may allow the one or more speakers to modify the sound in the environment to achieve a desired acoustic experience for the user, for example, by changing the equalization or spectral content, the phase, and/or the direction of the propagating sound waves.
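For illustration, the sketch below computes the per-element delays that steer a uniform linear array of drivers toward a given angle (classic delay-and-sum steering), which is one way a phased array can change the direction of the propagating sound waves. The element count, spacing, and speed-of-sound value are assumptions.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0

def steering_delays_s(num_elements: int,
                      element_spacing_m: float,
                      steer_angle_deg: float) -> list:
    """Per-element delays that steer a uniform linear array of drivers.

    The angle is measured from broadside; delaying each successive element by
    (d * sin(theta)) / c tilts the combined wavefront toward that angle.
    """
    theta = math.radians(steer_angle_deg)
    step_s = element_spacing_m * math.sin(theta) / SPEED_OF_SOUND_M_PER_S
    delays = [i * step_s for i in range(num_elements)]
    minimum = min(delays)  # shift so delays are non-negative for negative angles
    return [d - minimum for d in delays]

# Example: a 4-element array with 5 cm spacing steered 20 degrees off broadside.
print(steering_delays_s(4, 0.05, 20.0))  # ~[0.0, 5.0e-05, 1.0e-04, 1.5e-04] seconds
```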
In some embodiments, optional monitoring subsystem 2336 includes one or more acoustic wave sensors 2338 (e.g., one or more microphones, phased arrays, etc.) that monitor sounds in the environment including electronic device 2300. Acoustic monitoring may allow electronic device 2300 to acoustically characterize an environment, acoustically characterize sounds output by speakers in the environment (e.g., sounds corresponding to audio content), determine a location of a listener, determine a location of a speaker in the environment, and/or measure sounds from one or more speakers that correspond to one or more acoustic characterization patterns that may be used to coordinate playback of the audio content. Additionally, the optional monitoring subsystem 2336 may include a location sensor 2340 that may be used to determine the location of a listener or an electronic device (e.g., a speaker) in the environment.
The electronic device 2300 may be (or may be included in) any electronic device having at least one network interface. For example, the electronic device 2300 may be (or may be included in): desktop computers, laptop computers, sub-notebooks/netbooks, servers, tablet computers, smartphones, cellular phones, smart watches, consumer electronic devices (e.g., televisions, set-top boxes, audio devices, speakers, video devices, etc.), remote controls, portable computing devices, access points, routers, switches, communication devices, testing devices, and/or other electronic devices.
Although electronic device 2300 is described using specific components, in alternative embodiments, different components and/or subsystems may be present in electronic device 2300. For example, electronic device 2300 may include one or more additional processing subsystems, memory subsystems, network subsystems, and/or display subsystems. Further, while one of the antennas 2320 is shown coupled to a given one of the interface circuits 2318, there may be multiple antennas coupled to a given one of the interface circuits 2318. For example, a 3x3 radio may include three antennas. Additionally, one or more subsystems may not be present in electronic device 2300. Furthermore, in some embodiments, electronic device 2300 may include one or more additional subsystems not shown in fig. 23. Furthermore, although separate subsystems are shown in fig. 23, in some embodiments, some or all of a given subsystem or component may be integrated into one or more other subsystems or components in electronic device 2300. For example, in some embodiments, program module 2322 is included in operating system 2324.
Further, the circuits and components in electronic device 2300 may be implemented using any combination of analog and/or digital circuits, including: bipolar, PMOS and/or NMOS gates or transistors. Further, the signals in these embodiments may include digital signals having approximately discrete values and/or analog signals having continuous values. In addition, the components and circuits may be single ended or differential, and the power supply may be unipolar or bipolar.
The integrated circuit may implement some or all of the functionality of the network subsystem 2314 (e.g., one or more radios). Further, the integrated circuit may include hardware and/or software mechanisms for transmitting wireless signals from the electronic device 2300, and receiving signals at the electronic device 2300 from other electronic devices. In addition to the mechanisms described herein, radios are generally known in the art and therefore will not be described in detail. In general, the network subsystem 2314 and/or integrated circuits may include any number of radios.
In some embodiments, the network subsystem 2314 and/or integrated circuits include configuration mechanisms (e.g., one or more hardware and/or software mechanisms) that configure a radio to transmit and/or receive on a given channel (e.g., a given carrier frequency). For example, in some embodiments, a configuration mechanism may be used to switch a radio from monitoring and/or transmitting on a given channel to monitoring and/or transmitting on a different channel. (Note that "monitoring," as used herein, includes receiving signals from other electronic devices and possibly performing one or more processing operations on the received signals, e.g., determining whether the received signals include advertising frames or packets, calculating performance metrics, performing spectral analysis, etc.) Furthermore, the network subsystem 2314 may include at least one port (e.g., HDMI port 2332) to receive information in a data stream and/or to provide information in the data stream to at least one a/V display device 114 (fig. 1), at least one speaker 118 (fig. 1), and/or at least one content source 120 (fig. 1).
Although a Wi-Fi compatible communication protocol is used as an illustrative example, the described embodiments may be used in a variety of network interfaces. Further, while some of the operations in the foregoing embodiments are implemented in hardware or software, in general, the operations in the foregoing embodiments may be implemented in a wide variety of configurations and architectures. Accordingly, some or all of the operations in the foregoing embodiments may be performed in hardware, software, or both. For example, at least some of the operations in the communications techniques may be implemented using program module 2322, operating system 2324 (e.g., a driver for interface circuit 2318), and/or in firmware in interface circuit 2318. Alternatively or additionally, at least some operations in the communication techniques may be implemented in a physical layer (e.g., hardware in the interface circuit 2318).
Further, while the foregoing embodiments include a touch-sensitive display in the portable electronic device that is touched by a user (e.g., with a finger or digit, or a stylus), in other embodiments the user interface is displayed on the display of the portable electronic device and the user interacts with the user interface without touching or making contact with a surface of the display. For example, the user's interaction with the user interface may be determined using time-of-flight measurements, motion sensing (e.g., Doppler measurements), or other non-contact measurements that allow the position, direction of motion, and/or velocity of the user's fingers or digits (or a stylus) relative to the positions of one or more virtual command icons to be measured. In these embodiments, note that a user may activate a given virtual command icon by performing a gesture (e.g., "tapping" their finger in the air without making contact with the surface of the display). In some embodiments, the user navigates through the user interface and/or activates or deactivates a function of one of the components in the system 100 (fig. 1) using spoken commands or instructions (i.e., via voice recognition) and/or based on where they are looking on a display of the portable electronic device 110 or one of the a/V display devices 114 in fig. 1 (e.g., by tracking the user's gaze or the location at which the user is looking).
Further, although A/V hub 112 (FIG. 1) is shown as a separate component from A/V display device 114 (FIG. 1), in some embodiments, the A/V hub and A/V display device are combined into a single component or a single electronic device.
While the foregoing embodiments illustrate communication techniques with audio and/or video content (e.g., HDMI content), in other embodiments, the communication techniques are used in the context of any type of data or information. For example, communication techniques may be used with home automation data. In these embodiments, a/V hub 112 (fig. 1) may facilitate communication between and control of various electronic devices. Thus, a/V hub 112 (fig. 1) and communication technology may be used to facilitate or implement services in the so-called internet of things.
In the foregoing description, we have referred to "some embodiments". Note that "some embodiments" describe a subset of all possible embodiments, but do not always specify the same subset of embodiments.
The previous description is presented to enable one of ordinary skill in the art to make and use the disclosure, and is provided in the context of a particular application and its requirements. Furthermore, the foregoing descriptions of embodiments of the present disclosure have been presented only for purposes of illustration and description. They are not intended to be exhaustive or to limit the disclosure to the forms disclosed. Thus, many modifications and variations will be apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the disclosure. Additionally, the discussion of the preceding embodiments is not intended to limit the present disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
Claims (20)
1. A coordinating device, comprising:
one or more nodes configured to be communicatively coupled to one or more antennas;
an interface circuit communicatively coupled to the one or more nodes, wherein the coordinating device is configured to:
receiving, from the one or more nodes, input frames associated with an electronic device, wherein a given input frame comprises a transmission time at which the given electronic device transmits the given input frame;
storing a reception time when the input frame is received, wherein the reception time is based on a clock in the coordinating device;
calculating a current time offset between a clock in the electronic device and a clock in the coordinating device based on the receive time and the transmit time of the input frame, wherein the calculation of a given current time offset is based on a given difference between a given transmit time and a given receive time for a given packet, and wherein the calculation is based on one-way communication of timing information between a given electronic device and the coordinating device; and
transmitting, via the one or more nodes, one or more output frames comprising audio content intended for the electronic device and playback timing information, wherein the playback timing information specifies a playback time at which the electronic device is to concurrently playback the audio content based on the current time offset, and
wherein the playback time of the electronic device has a temporal relationship to coordinate playback of the audio content by the electronic device.
2. The coordinating device of claim 1, wherein the temporal relationship has a non-zero value, thereby instructing at least some of the electronic devices to playback audio content having a phase relative to each other by using different values of the playback time.
3. The coordinating device of claim 2, wherein the different playback times are based on an acoustic characterization of the environment.
4. The coordinating device of claim 2, wherein the different playback times are based on desired acoustic characteristics in the environment.
5. The coordinating device of claim 2, wherein the electronic device is located at a vector distance from the coordinating device;
wherein the interface circuit is configured to: determining a magnitude of the vector distance based on the transmit time and the receive time using wireless ranging, and configured to: determining an angle of the vector distance based on an angle of arrival of a wireless signal associated with the input frame; and is
Wherein the different playback times are based on the determined vector distance.
6. The coordinating device of claim 2, wherein the different playback times are based on an estimated location of a listener relative to the electronic device.
7. The coordinating device of claim 6, wherein the interface circuit is further configured to:
receiving, from the one or more nodes, a frame associated with another electronic device; and
calculating the estimated location of the listener based on the received frames.
8. The coordinating device of claim 6, wherein the coordinating device further comprises an acoustic wave sensor configured to perform acoustic measurements of an environment; and is
Wherein the coordinating device is configured to calculate the estimated location of the listener based on the sound measurements.
9. The coordinating device of claim 6, wherein the interface circuit is further configured to: receiving, from the one or more nodes, additional sound measurements of the environment associated with other electronic devices in the environment; and is
Wherein the coordinating device is configured to calculate the estimated location of the listener based on the additional sound measurements.
10. The coordinating device of claim 6, wherein the interface circuit is configured to:
performing a time-of-flight measurement; and
calculating the estimated location of the listener based on the time-of-flight measurements.
11. The coordinating device of claim 1, wherein the electronic device is located at a non-zero distance from the coordinating device; and is
Wherein the current time offset is calculated based on the transmission time and the reception time using wireless ranging by ignoring the distance.
12. The coordinating device of claim 1, wherein the current time offset is further based on a model of clock drift in the electronic device.
13. A non-transitory computer-readable storage medium for use with a coordinating device, the computer-readable storage medium storing a program that, when executed by the coordinating device, causes the coordinating device to coordinate playback of audio content by performing one or more operations comprising:
receiving, from one or more nodes in the coordinating device communicatively coupled to one or more antennas, input frames associated with an electronic device, wherein a given input frame comprises a transmit time at which the given electronic device transmits the given input frame;
storing a reception time when the input frame is received, wherein the reception time is based on a clock in the coordinating device;
calculating a current time offset between a clock in the electronic device and a clock in the coordinating device based on the receive time and the transmit time of the input frame, wherein the calculation of a given current time offset is based on a given difference between a given transmit time and a given receive time for a given input frame, and wherein the calculation is based on one-way communication of timing information between a given electronic device and the coordinating device; and
transmitting, via the one or more nodes, one or more output frames including the audio content intended for the electronic device and playback timing information, wherein the playback timing information specifies a playback time at which the electronic device is to concurrently play back the audio content based on the current time offset, and
wherein the playback times of the electronic devices have a temporal relationship so as to coordinate playback of the audio content by the electronic devices.
14. The non-transitory computer-readable storage medium of claim 13, wherein the temporal relationship has a non-zero value, thereby instructing at least some of the electronic devices to play back the audio content with phases relative to each other by using different values of the playback time; and
wherein the different playback times are based on an acoustic characterization of the environment.
15. The non-transitory computer-readable storage medium of claim 14, wherein the temporal relationship has a non-zero value, thereby instructing at least some of the electronic devices to play back the audio content with phases relative to each other by using different values of the playback time; and
wherein the different playback times are based on desired acoustic characteristics in the environment.
16. The non-transitory computer-readable storage medium of claim 14, wherein the temporal relationship has a non-zero value, thereby instructing at least some of the electronic devices to play back the audio content with phases relative to each other by using different values of the playback time;
wherein the electronic device is located at a vector distance from the coordinating device;
wherein the one or more operations comprise:
determining a magnitude of the vector distance based on the transmit time and the receive time using wireless ranging; and
determining an angle of the vector distance based on an angle of arrival of a wireless signal associated with the input frame; and
wherein the different playback times are based on the determined vector distance.
17. The non-transitory computer-readable storage medium of claim 14, wherein the temporal relationship has a non-zero value, thereby instructing at least some of the electronic devices to play back the audio content with phases relative to each other by using different values of the playback time; and
wherein the different playback times are based on an estimated location of the listener relative to the electronic device.
18. The non-transitory computer-readable storage medium of claim 13, wherein the one or more operations comprise:
performing time-of-flight measurements; and
calculating an estimated location of the listener based on the time-of-flight measurements.
19. The non-transitory computer-readable storage medium of claim 13, wherein the current time offset is further based on a model of clock drift in the electronic device.
20. A method for coordinating playback of audio content, comprising:
by a coordinating device:
receiving, from one or more nodes in the coordinating device communicatively coupled to one or more antennas, input frames associated with an electronic device, wherein a given input frame comprises a transmit time at which the given electronic device transmits the given input frame;
storing a reception time when the input frame is received, wherein the reception time is based on a clock in the coordinating device;
calculating a current time offset between a clock in the electronic device and a clock in the coordinating device based on the receive time and the transmit time of the input frame, wherein the calculation of a given current time offset is based on a given difference between a given transmit time and a given receive time for a given input frame, and wherein the calculation is based on one-way communication of timing information between a given electronic device and the coordinating device; and
transmitting, via the one or more nodes, one or more output frames comprising audio content intended for the electronic device and playback timing information, wherein the playback timing information specifies playback times when the electronic device is to simultaneously play back the audio content based on the current time offset, and
wherein the playback times of the electronic devices have a temporal relationship so as to coordinate playback of the audio content by the electronic devices.
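As an end-to-end illustration of how the method of claim 20 fits together, the sketch below translates a single hub-clock playback instant into per-device playback times using the current time offsets (hub receive time minus device transmit time), and optionally applies the non-zero temporal relationship of claim 2; all names and values are invented for the example:

```python
def coordinated_playback_times(hub_target_s, offsets_s, relationship_s=None):
    """Translate one playback instant on the hub clock into per-device playback
    times expressed in each device's own clock.  offsets_s holds current time
    offsets computed from one-way timing (hub receive time minus device transmit
    time); relationship_s optionally shifts individual devices by a non-zero
    amount (the temporal relationship of claim 2)."""
    relationship_s = relationship_s or {}
    return {
        device: hub_target_s - offset + relationship_s.get(device, 0.0)
        for device, offset in offsets_s.items()
    }

# Example: play 250 ms from now on the hub clock (hub time 12.0 s); the 'left'
# device is additionally delayed by 3.5 ms relative to 'right'.
offsets = {"left": -0.50024, "right": 0.01311}  # hub clock minus device clock
times = coordinated_playback_times(hub_target_s=12.250, offsets_s=offsets,
                                   relationship_s={"left": 0.0035})
# Each value would be carried in the playback timing information of the output
# frame sent to the corresponding device, which plays the audio at that local time.
print(times)
```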
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US62/433,237 | 2016-12-13 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK40015116A (en) | 2020-08-28 |
| HK40015116B (en) | 2021-08-13 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP7650907B2 (en) | | Radio Tuning of Audio Sources |
| HK40015116A (en) | 2020-08-28 | Wireless coordination of audio sources |
| HK40015116B (en) | 2021-08-13 | Wireless coordination of audio sources |