Digital assistant extensibility to third party applications
1. A computing device supporting a digital assistant, comprising:
one or more processors;
a User Interface (UI) configured to enable interaction between the digital assistant and a user of the computing device; and
one or more memory devices storing computer-readable instructions that, when executed by the one or more processors, cause the computing device to:
support a plurality of application extensions, each of the application extensions being uniquely associated with a respective application of a plurality of applications that are executable on the computing device,
instantiate a unique event handler in each of the plurality of application extensions, each unique event handler configured to process events associated with one of user inputs, actions, or behaviors of a corresponding application,
enable, by each application extension, receipt of application data exposed by a database associated with the respective application, wherein each application extension is configured with a unique manifest of application-specific resources that are loaded into a runtime environment and available to the digital assistant at runtime, the application-specific resources including commands that invoke application operations,
receive a series of inputs from the user at the UI,
monitor user behavior and user interaction with the computing device to develop context awareness applicable to a range of user inputs,
identify at least two different applications, data of which is obtained from respective associated application databases, wherein a first application is identified based on user input and context awareness, and a second application is identified based on user input, context awareness, and the identified first application,
associate an event with the series of user inputs, user behaviors, and the user interaction,
pass the event to the respective event handler in the application extension associated with each identified application,
in response to the event being received by the event handler in the respective application extension associated with the identified application, invoke execution of the identified application from the respective application extension to retrieve data from the associated application database relating to the user input in the series, and
operate the digital assistant through the UI to respond to the series of user inputs using the retrieved data.
2. The computing device of claim 1, wherein each input in the series comprises one of a request for information or a request for an action.
3. The computing device of claim 1, wherein the computer-readable instructions further cause the computing device to parse the user input to determine keywords included in one of the application-specific resources of the application extension manifest.
4. The computing device of claim 1, wherein the application-specific resources further comprise graphics and audio for a respective application.
5. The computing device of claim 1, wherein the application extension is authored by a third party developer.
6. The computing device of claim 1, wherein the UI supports one or more of a tangible UI, a natural language UI, or a gestural UI.
7. A method for extending functionality provided by a digital assistant running on a computing device, comprising:
configuring a plurality of application extensions to interoperate with a respective plurality of associated applications accessible on the computing device, wherein each application extension of the plurality of application extensions respectively includes a unique event handler configured to process events associated with one of user input, actions, or behaviors;
receiving input to the digital assistant from a user of the computing device;
developing, using the digital assistant, context awareness for the input by observing behaviors and actions associated with the user;
associating an event with the user input and the observed user behaviors and actions;
identifying a first application based on user input and the context awareness;
communicating the event from the digital assistant to an event handler instantiated in an application extension associated with the identified first application;
in response to receiving the event, invoking execution of the first application from an associated first application extension, wherein the executing first application interoperates with an associated first application database to obtain data responsive to the user input;
identifying a second application based on the user input, the context awareness, and the identified first application;
communicating the event from the digital assistant to an event handler instantiated in an application extension associated with the identified second application;
in response to receiving the event, invoking execution of the second application from an associated second application extension, wherein the executing second application interoperates with an associated second application database to obtain data responsive to the user input;
adding data obtained from the first application database and the second application database to a digital assistant database, the digital assistant database configured to store data usable by the digital assistant; and
operating the digital assistant on the computing device to interact with the digital assistant database to take an action or answer a question in response to the input, action, or behavior.
8. The method of claim 7, further comprising configuring the digital assistant to monitor context data associated with the computing device or the user and utilize the context data in taking actions and answering questions.
9. The method of claim 8, wherein the context data comprises one or more of: time/date, location of the user or device, language, schedule, applications installed on the device, user preferences, user behavior, user activity, stored contacts, call history, messaging history, browsing history, device type, device capabilities, or communication network type.
10. The method of claim 7, wherein at least a portion of an application extension is instantiated locally and the application extension is arranged to interact with a remote service, wherein at least a portion of the data responsive to the user input is obtained from the remote service.
11. The method of claim 10, wherein the remote service relates to one of a language, vocabulary, user preferences, or context.
12. The method of claim 7, further comprising operating the computing device to present a user experience on the computing device, the user experience supported by an application, the digital assistant, or both the digital assistant and the application.
13. The method of claim 7, wherein the digital assistant operation integrates data from the application extension into a local digital assistant user experience.
14. The method of claim 7, wherein the application extension is configured as a plug-in to the digital assistant.
15. One or more computer-readable storage media storing instructions that, when executed by a computing device, cause the computing device to:
present a digital assistant to a user of the computing device through a User Interface (UI) supported on the computing device;
configure a plurality of application extensions to enable user interaction with a corresponding plurality of applications executable on the computing device through the digital assistant UI, wherein each application extension of the plurality of application extensions includes a unique event handler, respectively;
observe user interactions with the digital assistant;
map observed user interactions with the digital assistant to the plurality of application extensions to identify respective applications for execution on the computing device;
pass events for observed user interactions with the digital assistant to respective event handlers in the mapped application extensions;
operate programming contained in the mapped application extensions to provide an application-specific context to the digital assistant;
in response to receiving an event, invoke execution of the respective identified application using the mapped application extension to receive data from an application database associated with the executing application; and
operate the digital assistant to respond to the observed user interaction using the received data and the application-specific context.
16. The one or more computer-readable storage media of claim 15, wherein the instructions further cause the computing device to collect data from the application extension, wherein the data is used to personalize a user experience.
17. The one or more computer-readable storage media of claim 15, wherein the observed interactions are used to determine a context that the digital assistant applies when rendering a user experience.
18. The one or more computer-readable storage media of claim 16, wherein the user experience comprises one or more of: sharing contact information, sharing stored contacts, scheduling a meeting, viewing a user's calendar, scheduling a reminder, making a phone call, operating a device, playing a game, making a purchase, taking notes, scheduling an alarm or wake up reminder, sending a message, checking social media for updates, crawling a website, interacting with a search service, sharing or displaying a file, sending a link to a website, or sending a link to a resource.
19. The one or more computer-readable storage media of claim 15, wherein an application extension comprises application-specific logic comprising one of a script or a programming construct.
20. The one or more computer-readable storage media of claim 15, wherein the instructions further cause the computing device to expose one or more databases associated with applications to the digital assistant using corresponding application extensions.
Background
Digital assistants can provide various features to device users and can make it easier to interact with a device to perform tasks, obtain information, and maintain connections with friends and colleagues by using voice interactions. Typically, a user interacts with the digital assistant using speech input, and the digital assistant may speak to the user using its own voice. While current features perform satisfactorily in many usage scenarios, increased functionality can make digital assistants even more beneficial and productive.
This background is provided to introduce a brief context for the summary and detailed description that follow. This background is not intended to be an aid in determining the scope of the claimed subject matter, nor is it intended to limit the claimed subject matter to implementations that solve any or all of the disadvantages or problems noted above.
SUMMARY
Digital assistants supported on devices such as smartphones, tablets, Personal Computers (PCs), game consoles, and the like include extensibility clients that interface with application extensions built by third party developers so that various aspects of the application user experience, content, or features can be integrated into the digital assistant and rendered as a native digital assistant experience. The application extensions may use various services provided from cloud-based and/or local sources (such as language/vocabulary, user preferences, and context services) that add intelligence and contextual relevance while enabling the extensions to insert and operate seamlessly within the context of the digital assistant. The application extensions can also access and utilize the general digital assistant functionality, data structures, and libraries exposed by the services and use the programming features captured in the extensions to implement application domain specific context and behavior. This extensibility to third party applications can broaden the range of information databases available to the digital assistant to answer questions and perform actions for the user.
The digital assistant extensibility of the present invention enables increased user efficiency in obtaining information and performing tasks using a digital assistant and improves overall user interaction performance with a device. By expanding the information databases available to the digital assistant, the extensibility improves the quality of answers and enables support for a broader and more complete set of responses and actions on the device. This may reduce the number of attempts needed to obtain desired information, which in turn may reduce the likelihood of unintended inputs to the device that can result in additional resource consumption and user frustration. Moreover, the extensibility enables devices to more efficiently utilize available computing resources, including network bandwidth, processing cycles, memory, and battery life in some cases. For example, data maintained by a digital assistant describing context and user behavior may be used to make an application operate more efficiently in delivering customized content, information, and user experiences, which may reduce network bandwidth requirements and the load on processing, storage, and memory resources on a device.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure. It will be appreciated that the above-described subject matter may be implemented as a computer-controlled apparatus, a computer process, a computing system, or as an article of manufacture such as one or more computer-readable storage media. These and various other features will become apparent from a reading of the following detailed description and a review of the associated drawings.
Description of the drawings
FIG. 1 shows an illustrative digital assistant including an extensibility client that interfaces with third party applications and extensions;
FIG. 2 shows an illustrative computing environment in which devices may communicate and interact with application services over a network;
FIG. 3 shows a local application and/or browser interacting with a remote application service;
FIG. 4 shows illustrative inputs to the digital assistant and illustrative classifications of general functions that may be performed by the digital assistant;
FIGS. 5, 6, and 7 show illustrative interfaces between a user and a digital assistant;
FIG. 8 shows an illustrative hierarchical architecture including a digital assistant component, an extensibility client, and an application extension;
FIG. 9 shows an illustrative service exposed by a digital assistant extensibility service;
FIG. 10 shows illustrative interactions between an application extension and an operating system on a device during application installation;
FIG. 11 shows illustrative interactions between application extensions and a digital assistant extensibility client during application runtime;
FIG. 12 shows three illustrative application extensions installed on a device;
FIGS. 13, 14, and 15 show illustrative digital assistant extensibility user experience scenarios;
FIGS. 16, 17, and 18 show illustrative methods that may be performed when implementing the digital assistant extensibility of the present invention;
FIG. 19 is a simplified block diagram of an illustrative computer system, such as a Personal Computer (PC), that may be used in part to implement the digital assistant extensibility of the present invention;
FIG. 20 shows a block diagram of an illustrative device that may be used in part to implement the digital assistant extensibility of the present invention;
FIG. 21 is a block diagram of an illustrative mobile device; and
FIG. 22 is a block diagram of an illustrative multimedia console.
Like reference symbols in the various drawings indicate like elements. Elements are not drawn to scale unless otherwise indicated.
Detailed Description
FIG. 1 shows an overview of a digital assistant extensibility arrangement 100 in which a user 105 employs a device 110 that hosts a digital assistant 112. The digital assistant 112 supports an extensibility client 114 that typically interoperates over a network 115 with an extensibility service 118 supported by a remote digital assistant service 130. Alternatively, in some cases, the extensibility service may be instantiated in part or in whole as a local service 135. The digital assistant extensibility client 114 is configured to enable interaction with application extensions 140 so that various aspects of an application's user experience, features, and content can be integrated with the digital assistant 112. Typically, the extensibility is implemented so that applications can render user experiences, features, and content within the digital assistant with a similar and consistent sound, look, and feel in most cases, such that transitions between the applications and the digital assistant are handled smoothly and experiences are rendered seamlessly to the user.
The extensions 140 may be associated with third party applications 150 in some cases in which the application author, developer, or provider is an entity that is not the same as the provider of the digital assistant 112. First party applications may also be supported in some implementations. In some cases, the digital assistant extensibility service 118 can support direct interaction with the applications 150, as indicated by line 152 in FIG. 1.
Various details of an illustrative implementation of digital assistant extensibility are now presented. FIG. 2 shows an illustrative environment 200 in which various users 105 use various devices 110 that communicate over a network 115. Each device 110 includes an instance of a digital assistant 112. Device 110 may support voice telephony capabilities in some cases and typically supports data consumption applications such as internet browsing and multimedia (e.g., music, video, etc.) consumption, among various other features. The device 110 may include, for example, user equipment, mobile phones, cellular phones, feature phones, tablet computers, and smart phones that users typically use to place and receive voice and/or multimedia (i.e., video) calls, engage in messaging (e.g., texting) and email communications, use applications and access services that utilize data, browse the world wide web, and so forth.
Other types of electronic devices are also contemplated for use within the environment 200, including handheld computing devices, PDAs (Personal Digital Assistants), portable media players, devices using headsets and earphones (e.g., Bluetooth-compatible devices), large screen devices (i.e., combination smartphone/tablet devices), wearable computers, navigation devices such as GPS (Global Positioning System) systems, laptop PCs (Personal Computers), desktop computers, multimedia consoles, gaming systems, and the like. In the discussion that follows, use of the term "device" is intended to cover all devices that are equipped with communication capabilities and are capable of connectivity to the communications network 115.
Various devices 110 in the environment 200 may support different features, functions, and capabilities (collectively referred to herein as "features"). Some of the features supported on a given device may be similar to those supported on other devices, but other features may be unique to the given device. The degree of overlap and/or difference between features supported on the various devices 110 may vary from implementation to implementation. For example, some devices 110 may support touch controls, gesture recognition, and voice commands, while other devices may implement a more limited UI. Some devices may support video consumption and internet browsing, while other devices may support more limited media processing and network interface features.
Accessory devices 218 (such as wristbands and other wearable devices) may also be present in the environment 200. Such accessory devices 218 are typically adapted to interoperate with the device 110 using a short-range communication protocol (e.g., Bluetooth) to support functions such as monitoring of the wearer's physiology (e.g., heart rate, number of steps taken, calories burned, etc.) and environmental conditions (temperature, humidity, Ultraviolet (UV) levels, etc.) and presenting notifications from the coupled device 110.
Device 110 may generally utilize the network 115 to access and/or implement various user experiences. The network may include any of a variety of network types and network infrastructures in various combinations or sub-combinations, including cellular networks, satellite networks, IP (Internet Protocol) networks such as Wi-Fi and Ethernet networks, a Public Switched Telephone Network (PSTN), and/or short-range networks such as Bluetooth® networks. The network infrastructure may be supported, for example, by mobile operators, enterprises, Internet Service Providers (ISPs), telephone service providers, data service providers, and the like.
The network 115 may utilize portions of the internet or include interfaces that support connections to the internet such that the device 110 may access content provided by one or more content providers and also render user experiences supported by various application services 225. The application services 225 may each support a wide variety of applications, such as social networking, maps, news and information, entertainment, travel, productivity, finance, and so forth. A digital assistant service 130 (described in more detail below) is also present in the computing environment 200.
As shown in FIG. 3, the device 110 may generally include a local component (such as a browser 305) or one or more applications 150 that can facilitate interaction with the application service 225. For example, in some scenarios, the user 105 may launch a locally-executing application that communicates over a network to a service in order to retrieve data to implement various features and functions, provide information, and/or support a given user experience that can be rendered on a user interface of the local device 110. In some scenarios, an application may operate locally on a device without interfacing with a remote service.
FIG. 4 shows illustrative classifications of functions 400, which functions 400 may generally be supported by the digital assistant 112 either natively or in combination with the application 150. Inputs to the digital assistant 112 may generally include user input 405, data 410 from internal sources, and data 415 from external sources that may include third-party content 418. For example, the data 410 from the internal source may include the current location of the device 110 as reported by a GPS (Global Positioning System) component or some other location-aware component on the device 110. The data 415 provided by the external source includes, for example, data provided by an external system, database, service, or the like.
Various inputs may be used alone or in various combinations to enable the digital assistant 112 to utilize the context data 420 in its operation. The context data may include, for example, time/date, location of the user, language, schedule, applications installed on the device, preferences of the user, behaviors of the user (where such behaviors are monitored/tracked with notice to the user and the user's consent), stored contacts (including in some cases links to a social graph of local or remote users, such as those maintained by an external social networking service), call history, messaging history, browsing history, device type, device capabilities, communication network type and/or features/functions provided therein, mobile data plan constraints/limitations, data associated with other parties to a communication (e.g., their schedules, preferences, etc.), and so forth.
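The context signals enumerated above can be pictured as a simple data structure that the assistant accumulates over time. The following is a minimal sketch; the class and field names are illustrative assumptions rather than part of the described implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class ContextData:
    # Each field mirrors one of the context signals listed above;
    # the names here are hypothetical, chosen for illustration only.
    language: str = "en-US"
    location: Optional[str] = None
    device_type: Optional[str] = None
    installed_apps: List[str] = field(default_factory=list)
    user_preferences: Dict[str, str] = field(default_factory=dict)

# Signals accumulate as the assistant observes the device and the user:
ctx = ContextData(location="Seattle", device_type="smartphone")
ctx.installed_apps.append("MovieDatabase")
ctx.user_preferences["units"] = "imperial"
```

A structure like this is what the extensibility services could later expose to application extensions so their responses can be contextually relevant.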
As shown, functionality 400 illustratively includes interacting 425 with a user (e.g., through a natural language UI and other graphical UIs); performing tasks 430 (e.g., making appointment records on a user's calendar, sending messages and emails, etc.); providing services 435 (e.g., answering questions from a user, drawing a route to a destination, setting an alarm, forwarding a notification, reading an email, news, blog, etc.); collecting information 440 (e.g., finding information about books or movies requested by the user, locating the nearest italian restaurant, etc.); operating the device 445 (e.g., setting preferences, adjusting screen brightness, turning wireless connections (e.g., Wi-Fi and bluetooth) on and off, communicating with other devices, controlling smart appliances, etc.); and perform various other functions 450. The list of functions 400 is not intended to be exhaustive and other functions may be provided by the digital assistant 112 and/or the application 150 as they may be needed for a particular implementation of the digital assistant extensibility of the present invention.
Depending on the features and functions supported by a given device 110, a user may generally interact with the digital assistant 112 in several ways. For example, as shown in FIG. 5, the digital assistant 112 may present a tangible user interface 505 that enables the user 105 to employ physical interactions 510 in support of the user experience on the device 110. Such physical interactions may include manipulating physical and/or virtual controls (such as buttons, menus, keyboards, etc.), using touch-based inputs (such as tapping, flicking, dragging, etc.) on a touch screen, and the like.
In some implementations, the digital assistant 112 can present a natural language user interface 605, shown in FIG. 6, or alternatively a voice command-based user interface (not shown), through which a user employs speech 610 to provide various inputs to the device 110.
In other implementations, the digital assistant 112 may present a gestural user interface 705, shown in FIG. 7, through which the user 105 may employ gestures 710 to provide input to the device 110. Note that in some cases, various combinations of user interfaces may be utilized, where a user may interact with the digital assistant 112 and the device 110, for example, employing both voice and physical input. User gestures may be sensed using various techniques, such as optical sensing, touch sensing, proximity sensing, and the like.
FIG. 8 shows an illustrative hierarchical architecture 800 that may be instantiated on a given device 110. Architecture 800 is typically implemented in software, but in some cases may also be implemented in a combination of software, firmware, and/or hardware. Architecture 800 is arranged into multiple layers and includes an application layer 805, an OS (operating system) layer 810, and a hardware layer 815. The hardware layer 815 provides an abstraction of the various hardware used by the device 110 (e.g., input and output devices, networking and radio hardware, etc.) to the layers above the hardware layer. In this illustrative example, the hardware layer supports a microphone 820 and an audio endpoint 825, which may include, for example, the device's built-in speakers, wired or wireless headphones/earphones, external speakers/devices, and so forth.
The application layer 805 in this illustrative example supports various applications 150 (e.g., web browsers, mapping applications, email applications, news applications, etc.) as well as the digital assistant extensibility client 114. Applications are typically implemented using locally executed code. However, in some cases, these applications may rely on services and/or remote code execution provided by remote servers or other computing platforms, such as those supported by service providers or other cloud-based resources. While the digital assistant extensibility client 114 is shown herein as a component instantiated in the application layer 805, it can be appreciated that the functionality provided by a given application can be implemented in whole or in part using components supported in the OS or hardware layers.
The OS layer 810 supports the digital assistant 112 and various other OS components 855. In a typical implementation, the digital assistant 112 can interact with the digital assistant service 130, as indicated by line 860. That is, the digital assistant 112 may, in some implementations, partially or fully utilize remote code execution supported at the service 130, or use other remote resources. Further, it may utilize and/or interact with other OS components 855 (and/or other components instantiated in other layers of the architecture 800) as may be needed to implement the various features and functions described herein. In some implementations, some or all of the functionality supported by the digital assistant extensibility client 114 can be integrated into the digital assistant, as shown by the dashed rectangle in FIG. 8. As noted above, the digital assistant 112 can also interact with extensibility services that are instantiated partially or wholly locally on the device 110. For example, a service may apply local resources and implement local logic to support various user experiences and features.
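The remote-versus-local arrangement described above (extensibility service 118 with a locally instantiated fallback 135) might be sketched as follows. The class names, the `lookup` method, and the ConnectionError-based fallback are assumptions made for illustration, not the patent's implementation.

```python
class RemoteExtensibilityService:
    """Stand-in for the remote extensibility service (hypothetical API)."""
    def __init__(self, reachable: bool = True):
        self.reachable = reachable

    def lookup(self, query: str) -> str:
        if not self.reachable:
            raise ConnectionError("remote extensibility service unreachable")
        return f"remote:{query}"

class LocalExtensibilityService:
    """Stand-in for a partially or wholly local service instance."""
    def lookup(self, query: str) -> str:
        return f"local:{query}"

def resolve(query: str, remote: RemoteExtensibilityService,
            local: LocalExtensibilityService) -> str:
    # Prefer remote code execution; fall back to local logic when the
    # remote service cannot be reached.
    try:
        return remote.lookup(query)
    except ConnectionError:
        return local.lookup(query)
```

The point of the sketch is the fallback structure: the digital assistant's behavior stays the same whether the extensibility service runs remotely, locally, or both.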
FIG. 9 shows an illustrative service 900 that may be exposed to application extensions 140 by remote digital assistant extensibility service 118 and local clients 114. The service 900 can also be partially or fully implemented and/or rendered locally on the device 110 by the extensibility client 114 and/or the local digital assistant extensibility service 135 (FIG. 1). Alternatively, some or all of the services are provided directly from extensibility service 118 to the application, in some cases using an interface (not shown) that is remotely accessible. Service 130 may access other services from various providers (such as search service 935) as these other services may be needed to support the provisioning of service 900.
The language and vocabulary service 905 may support the extensions' use of different languages when providing data and/or services to the digital assistant. For example, some applications may be utilized in multi-language settings, while other applications may have regional or global distribution that makes support for multiple languages attractive. The vocabulary service may support utilization of application-specific and/or industry-specific vocabularies. For example, technical and scientific vocabularies may be supported for applications dealing with computer and technology news. Thus, a news-reading application can access the vocabulary service so that when an article is read aloud to the user 105 by the digital assistant, particular terms are pronounced correctly.
The user preference service 910 may enable extensions to take into account user preferences maintained by the digital assistant when providing data and services. The context service 915 similarly can enable the extension to use context data maintained by the digital assistant. Other services 920 may also be exposed by extensibility service 118 to meet the needs of a particular implementation.
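As a sketch of how the vocabulary service described above might let a news-reading extension register domain-specific pronunciations for text-to-speech, consider the following; the API shape (`register`/`pronounce`) is a hypothetical illustration.

```python
class VocabularyService:
    """Maps domain-specific terms to phonetic renderings so that
    text-to-speech pronounces them correctly when reading aloud."""
    def __init__(self):
        self._pronunciations = {}

    def register(self, term: str, phonetic: str) -> None:
        self._pronunciations[term.lower()] = phonetic

    def pronounce(self, term: str) -> str:
        # Unregistered terms fall back to their written form.
        return self._pronunciations.get(term.lower(), term)

vocab = VocabularyService()
# A technology-news extension might register specialist terms:
vocab.register("NVMe", "en vee em ee")
```

The user preference and context services could follow the same pattern, exposing lookups over the preference and context data the digital assistant already maintains.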
As shown in FIG. 10, during installation of an application extension 140 on the device 110, an application package manifest 1015, or similar installation package used to validate and deploy the application, is configured to launch a request 1020 to access digital assistant resources. Typically, the request describes the extensibility points of interaction for the application, a description of the needed capabilities and resources, and the like, in order to facilitate interaction between the application and the operating system 1050 and/or digital assistant components executing thereon.
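An install-time capability request of the kind described might look roughly like the following. The manifest fields and the validation rule are illustrative assumptions, not an actual package format.

```python
# Hypothetical manifest content an installer might validate; the keys
# ("extensibility_points", "required_capabilities") are assumptions.
APP_MANIFEST = {
    "app": "MovieDatabase",
    "extension": {
        "extensibility_points": ["digital-assistant"],
        "required_capabilities": ["microphone", "network"],
    },
}

# Capabilities this (hypothetical) operating system is willing to grant.
GRANTABLE_CAPABILITIES = {"microphone", "network", "location"}

def validate_request(manifest: dict) -> bool:
    """Accept the install request only if every requested capability
    is one the operating system can grant."""
    requested = set(manifest["extension"]["required_capabilities"])
    return requested <= GRANTABLE_CAPABILITIES
```

A real installer would of course also verify signatures and deploy the package; the sketch shows only the capability negotiation step described above.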
As shown in FIG. 11, during application extension operation at runtime on the device 110 in a runtime environment 1110, the application extension 140 can interface with the digital assistant extensibility client 114 through Application Programming Interfaces (APIs) 1115 and load a manifest 1120, which can include application-specific resources such as graphics, audio, commands, and other information. For example, the manifest 1120 can include keywords 1122 that are loaded from the manifest and registered with the digital assistant extensibility client. The registered keywords may be invoked by the user at runtime, and input events may be directed to the appropriate application extension for handling. An application name is a typical example of a keyword, so that a user can direct the digital assistant by name to launch an application or obtain information, services, content, and the like from the named application. During runtime, the extensibility client 114 can pass events associated with user inputs, actions, and behaviors to event handlers 1125 in the application extension. The application extension may apply logic 1130 (such as scripts and other programming constructs) to facilitate a particular user experience or user interface through the digital assistant.
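The manifest-loading and keyword-registration flow above can be sketched as follows. This is a minimal illustration under stated assumptions: the `ExtensibilityClient` class, the JSON manifest schema, and the handler names are all hypothetical, not the actual manifest format or client API.

```python
# Minimal sketch: load an extension manifest, register its keywords with
# an extensibility client, and dispatch user-input events to the
# extension's event handler. All names are illustrative assumptions.

import json

MANIFEST_JSON = """
{
  "application": "MovieDatabase",
  "keywords": ["movie database", "movies"],
  "resources": {"icon": "assets/icon.png"},
  "commands": {"find": "search_titles"}
}
"""

class ExtensibilityClient:
    """Routes user-input events to the extension that registered a keyword."""

    def __init__(self):
        self._handlers = {}

    def register_keyword(self, keyword, handler):
        self._handlers[keyword.lower()] = handler

    def dispatch(self, utterance):
        # Pass the event to the first extension whose keyword matches.
        for keyword, handler in self._handlers.items():
            if keyword in utterance.lower():
                return handler(utterance)
        return None


def movie_event_handler(event):
    # Stands in for the event handlers 1125 inside the extension.
    return f"MovieDatabase extension handled: {event}"


client = ExtensibilityClient()
manifest = json.loads(MANIFEST_JSON)
for kw in manifest["keywords"]:
    client.register_keyword(kw, movie_event_handler)

print(client.dispatch("Show the Movie Database top picks"))
```

Registering the application name as a keyword, as the text describes, is what lets the user invoke the application by name through the assistant.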
Fig. 12 shows three illustrative applications and corresponding extensions installed on device 110. The applications include a movie database application 1205 and extension 1210, an e-commerce application 1215 and extension 1220, and a crowd-sourced commentary application 1225 and extension 1230. It is emphasized that the applications and extensions are intended to be illustrative and that any of a variety of applications and extensions can be utilized in a given scenario.
FIGS. 13-15 show illustrative digital assistant extensibility user experiences using the three applications shown in FIG. 12 and described in the accompanying text. In FIG. 13, user 105 interacts with the digital assistant 112 (named "Cortana" in this illustrative example) operating on device 110. As shown, the digital assistant may interact with the e-commerce application through its extension to present information about items on the user's wish list. In FIG. 14, the digital assistant may interact with the movie database application through its extension to find recommended movies for the user. The digital assistant can use its capabilities to track communications in order to find the best way to forward the movie information to one of the user's contacts. The user may also invoke a third-party application, in this case "Xbox video," by name, so that the digital assistant will interact with the named application in response to the user's request. Here, the Xbox video application extension registers its name as a keyword so that the user can refer to the application by name when interacting with the digital assistant.
In fig. 15, the digital assistant interfaces with the crowd-sourced reviews application through its extension to provide restaurant recommendations to the user. The digital assistant may use information from the review application to provide various services, such as presenting recommendations, forwarding menus, providing routes, and so forth.
FIG. 16 shows a flowchart of an illustrative method 1600 for operating a digital assistant on a device (e.g., device 110). Unless specifically stated, the methods or steps shown in the flowcharts and described in the accompanying text are not constrained to a particular order or sequence. In addition, some of the methods or steps thereof can occur or be performed concurrently, and, depending on the requirements of a given implementation, not all of the methods or steps have to be performed and some may be utilized optionally.
At step 1605, an interface is configured for receiving application-specific services from extensions associated with respective applications operating on the device. At step 1610, the user interface is configured to receive voice commands from a device user. At step 1615, the received input is mapped to a corresponding extension for processing. At step 1620, a digital assistant extensibility service (such as service 900 shown in FIG. 9 and described in accompanying text) can be exposed to the application extension.
At step 1625, the digital assistant extensibility client receives the application-specific service from the extension in response to the device user input. At step 1630, the application-specific service may be rendered such that the cross-application user experience is exposed as a native digital assistant user experience.
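The steps of method 1600 can be sketched end to end, purely as an illustrative assumption; none of the class or function names below come from the source, and the string-matching dispatch stands in for whatever mapping logic a real implementation would use.

```python
# Illustrative sketch of the method-1600 flow: map user input to an
# extension and render its application-specific service as a native
# digital assistant experience. All names are assumptions.

class DigitalAssistant:
    def __init__(self):
        self._extensions = {}

    def register_extension(self, app_name, service_fn):
        # Steps 1605/1620: expose the interface and extensibility
        # service to the application extension.
        self._extensions[app_name] = service_fn

    def handle_input(self, user_input):
        # Step 1615: map the received input to the corresponding extension.
        for app_name, service_fn in self._extensions.items():
            if app_name in user_input.lower():
                # Steps 1625-1630: receive the application-specific
                # service and render it in the assistant's own voice.
                return f"[assistant] {service_fn(user_input)}"
        return "[assistant] Sorry, I can't help with that."


assistant = DigitalAssistant()
assistant.register_extension("wish list", lambda q: "Your wish list has 3 items.")
print(assistant.handle_input("Show my wish list"))
```

Because the extension's answer is rendered by the assistant itself, the cross-application experience appears to the user as a native digital assistant experience, as step 1630 describes.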
FIG. 17 is a flowchart of an illustrative method 1700 that may be implemented on a device (e.g., device 110). At step 1705, a context-aware digital assistant is exposed on the device, where the context awareness can be gained at least in part by monitoring user behavior and interactions with the device (typically with notice to the user and with the user's consent). At step 1710, an input from the user is received. At step 1715, the context awareness is used to deliver the user input to an application extension for processing. At step 1720, the application extension may load application-specific resources, run scripts, and process events. In some cases, an application database may be exposed as part of the service at step 1725. At step 1730, the digital assistant is operated to render the service received from the extension.
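Step 1715's context-aware delivery could, under assumed names and a deliberately simple scoring heuristic, look like the sketch below; the scoring scheme and the extension metadata shape are illustrative assumptions only.

```python
# Hedged sketch of using context awareness to choose which application
# extension should process an ambiguous input (method 1700, step 1715).
# The heuristic and all names are illustrative assumptions.

def choose_extension(user_input, context, extensions):
    """Score each extension on keyword hits plus a context hint."""
    best_name, best_score = None, 0
    for name, meta in extensions.items():
        score = sum(kw in user_input.lower() for kw in meta["keywords"])
        if meta.get("category") == context.get("recent_activity"):
            score += 1  # context tips the balance between close matches
        if score > best_score:
            best_name, best_score = name, score
    return best_name


extensions = {
    "reviews": {"keywords": ["restaurant", "review"], "category": "dining"},
    "movies": {"keywords": ["movie", "film"], "category": "entertainment"},
}
context = {"recent_activity": "dining"}

print(choose_extension("find a good place nearby", context, extensions))  # reviews
```

Here the input mentions no keyword at all, so the context signal alone routes it to the reviews extension, which is the kind of disambiguation the monitoring at step 1705 enables.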
Fig. 18 shows an illustrative method 1800 that may be utilized by a service provider. At step 1805, one or more servers at the provider can interoperate with the digital assistant extensibility client running on the local device. At step 1810, digital assistant extensibility services are maintained, and at step 1815 they are provided to application extensions through local extensibility clients.
FIG. 19 is a simplified block diagram of an illustrative computer system 1900, such as a PC, client machine, or server, that may be used to implement the present digital assistant extensibility. The computer system 1900 includes a processor 1905, a system memory 1911, and a system bus 1914 that couples various system components, including the system memory 1911, to the processor 1905. The system bus 1914 may be any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, or a local bus using any of a variety of bus architectures. The system memory 1911 includes Read Only Memory (ROM) 1917 and Random Access Memory (RAM) 1921. A Basic Input/Output System (BIOS) 1925, containing the basic routines that help to transfer information between elements within the computer system 1900, such as during startup, is stored in ROM 1917. The computer system 1900 may also include a hard disk drive 1928 for reading from and writing to an internal hard disk (not shown), a magnetic disk drive 1930 for reading from or writing to a removable magnetic disk 1933 (e.g., a floppy disk), and an optical disk drive 1938 for reading from or writing to a removable optical disk 1943 such as a CD (compact disc), DVD (digital versatile disc), or other optical media. The hard disk drive 1928, magnetic disk drive 1930, and optical disk drive 1938 are connected to the system bus 1914 by a hard disk drive interface 1946, a magnetic disk drive interface 1949, and an optical drive interface 1952, respectively. The drives and their associated computer-readable storage media provide nonvolatile storage of computer-readable instructions, data structures, program modules, and other data for the computer system 1900.
Although this illustrative example includes a hard disk, a removable magnetic disk 1933, and a removable optical disk 1943, other types of computer-readable storage media that can store data accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, data cartridges, random access memories ("RAMs"), read only memories ("ROMs"), and the like, may also be used in some applications of the present digital assistant extensibility. Further, as used herein, the term computer-readable media includes one or more instances of a media type (e.g., one or more disks, one or more CDs, etc.). For purposes of this specification and the claims, the phrase "computer-readable storage media" and variations thereof does not include waves, signals, and/or other transitory and/or intangible communication media.
A number of program modules can be stored on the hard disk, magnetic disk 1933, optical disk 1943, ROM 1917, or RAM 1921, including an operating system 1955, one or more application programs 1957, other program modules 1960, and program data 1963. A user may enter commands and information into the computer system 1900 through input devices such as a keyboard 1966 and pointing device 1968, such as a mouse. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, trackball, touch pad, touch screen, touch-sensitive device, voice command module or device, user motion or user gesture capture device, and/or the like. These and other input devices are often connected to the processor 1905 through a serial port interface 1971 that is coupled to the system bus 1914, but may be connected by other interfaces, such as a parallel port, game port or a Universal Serial Bus (USB). A monitor 1973 or other type of display device may also be connected to the system bus 1914 via an interface, such as a video adapter 1975. In addition to the monitor 1973, personal computers typically include other peripheral output devices (not shown), such as speakers and printers. The illustrative example shown in FIG. 19 also includes a host adapter 1978, a Small Computer System Interface (SCSI) bus 1983, and an external storage device 1976 connected to the SCSI bus 1983.
Computer system 1900 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 1988. The remote computer 1988 may alternatively be another personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer system 1900, although only a single representative remote memory/storage device 1990 has been illustrated in fig. 19. The logical connections depicted in FIG. 19 include a Local Area Network (LAN)1993 and a Wide Area Network (WAN) 1995. Such networking environments are commonly deployed in, for example, offices, enterprise-wide computer networks, intranets, and the Internet.
When used in a LAN networking environment, the computer system 1900 is connected to the local network 1993 through a network interface or adapter 1996. When used in a WAN networking environment, the computer system 1900 typically includes a broadband modem 1998, network gateway, or other means for establishing communications over the wide area network 1995, such as the Internet. A broadband modem 1998, which may be internal or external, is connected to the system bus 1914 via the serial port interface 1971. In a networked environment, program modules depicted relative to the computer system 1900, or portions thereof, may be stored in the remote memory storage device 1990. Note that the network connections shown in fig. 19 are illustrative, and other means for establishing a communications link between the computers may be used, depending on the specific requirements of the application for the extensibility of the digital assistant of the present invention.
FIG. 20 shows an illustrative architecture 2000 for a device capable of executing the various components described herein for providing the present digital assistant extensibility. The architecture 2000 illustrated in FIG. 20 may be adapted for use with a server computer, mobile phone, PDA, smartphone, desktop computer, netbook computer, tablet computer, GPS device, game console, and/or laptop computer. The architecture 2000 may be utilized to execute any aspect of the components presented herein.
The architecture 2000 shown in fig. 20 includes a CPU (central processing unit) 2002, a system memory 2004 including a RAM 2006 and a ROM 2008, and a system bus 2010 coupling the memory 2004 to the CPU 2002. A basic input/output system containing the basic routines that help to transfer information between elements within the architecture 2000, such as during startup, is stored in the ROM 2008. The architecture 2000 also includes a mass storage device 2012 for storing software code or other computer-executed code used to implement applications, file systems, and operating systems.
The mass storage device 2012 is connected to the CPU 2002 through a mass storage controller (not shown) connected to the bus 2010. The mass storage device 2012 and its associated computer-readable storage media provide non-volatile storage for the architecture 2000.
Although the description of computer-readable storage media contained herein refers to a mass storage device, such as a hard disk or CD-ROM drive, it should be appreciated by those skilled in the art that computer-readable storage media can be any available storage media that can be accessed by the architecture 2000.
By way of example, and not limitation, computer-readable storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. For example, computer-readable storage media includes, but is not limited to, RAM, ROM, EPROM (erasable programmable read-only memory), EEPROM (electrically erasable programmable read-only memory), flash memory or other solid state memory technology, CD-ROM, DVD, HD-DVD (high definition DVD), BLU-RAY, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the architecture 2000.
According to various embodiments, the architecture 2000 may operate in a networked environment using logical connections to remote computers through a network. The architecture 2000 may connect to a network through a network interface unit 2016 connected to a bus 2010. It should be appreciated that the network interface unit 2016 may also be utilized to connect to other types of networks and remote computer systems. The architecture 2000 may also include an input/output controller 2018 for receiving and processing input from a number of other devices, including a keyboard, mouse, or electronic stylus (not shown in FIG. 20). Similarly, the input/output controller 2018 may provide output to a display screen, a printer, or other type of output device (also not shown in FIG. 20).
It should be appreciated that the software components described herein may, when loaded into the CPU 2002 and executed, transform the CPU 2002 and the overall architecture 2000 from a general-purpose computing system into a special-purpose computing system customized to facilitate the functionality presented herein. The CPU 2002 may be constructed with any number of transistors or other discrete circuit elements (which may individually or collectively assume any number of states). More specifically, the CPU 2002 may operate as a finite state machine in response to executable instructions contained in the software modules disclosed herein. These computer-executable instructions may transform the CPU 2002 by specifying how the CPU 2002 transitions between states, thereby transforming the transistors or other discrete hardware elements that make up the CPU 2002.
Encoding the software modules presented herein may also transform the physical structure of the computer-readable storage media presented herein. The particular transformation of physical structure may depend on various factors in different implementations of this description. Examples of such factors may include, but are not limited to, the technology used to implement the computer-readable storage media, whether the computer-readable storage media are characterized as primary or secondary storage, and the like. For example, if the computer-readable storage media are implemented as semiconductor-based memory, the software disclosed herein may be encoded on the computer-readable storage media by transforming the physical state of the semiconductor memory. For example, the software may transform the state of transistors, capacitors, or other discrete circuit elements constituting the semiconductor memory. The software may also transform the physical state of such components in order to store data thereupon.
As another example, the computer-readable storage media disclosed herein may be implemented using magnetic or optical technology. In such implementations, the software presented herein may transform the physical state of magnetic or optical media when the software is encoded therein. These transformations may include altering the magnetic properties of particular locations within a given magnetic medium. These transformations may also include altering the physical features or properties of particular locations within a given optical medium to change the optical properties of those locations. Other transformations of physical media are possible without departing from the scope and spirit of the present description, with the foregoing examples provided only to facilitate this discussion.
In view of the above, it should be appreciated that many types of physical transformations take place in the architecture 2000 in order to store and execute the software components presented herein. It should also be understood that the architecture 2000 may include other types of computing devices, including handheld computers, embedded computer systems, smart phones, PDAs, and other types of computing devices known to those skilled in the art. It is also contemplated that the architecture 2000 may not include all of the components shown in fig. 20, may include other components not explicitly shown in fig. 20, or may utilize an architecture completely different from that shown in fig. 20.
Fig. 21 is a functional block diagram of an illustrative device 110, such as a mobile phone or smart phone, including various optional hardware and software components, shown generally at 2102. Any component 2102 in the mobile device may communicate with any other component, but not all connections are shown for ease of illustration. The mobile device can be any of a variety of computing devices (e.g., cell phone, smart phone, handheld computer, PDA, etc.) and can allow wireless two-way communication with one or more mobile communication networks 2104, such as a cellular or satellite network.
The illustrated device 110 may include a controller or processor 2110 (e.g., a signal processor, microprocessor, microcontroller, ASIC (Application Specific Integrated Circuit), or other control and processing logic circuitry) for performing tasks such as signal coding, data processing, input/output processing, power control, and/or other functions. The operating system 2112 may control the allocation and use of the components 2102 (including power states, locked states, and unlocked states) and provide support for one or more application programs 2114. The application programs may include common mobile computing applications (e.g., image-capture applications, email applications, calendars, contact managers, web browsers, messaging applications) or any other computing application.
The illustrated device 110 may include a memory 2120. Memory 2120 can include non-removable memory 2122 and/or removable memory 2124. The non-removable memory 2122 may include RAM, ROM, flash memory, a hard disk, or other well-known memory storage technologies. The removable memory 2124 may include flash memory or a Subscriber Identity Module (SIM) card, which is well known in GSM (Global System for Mobile communications) systems, or other well-known memory storage technologies, such as "smart cards." Memory 2120 may be used to store data and/or code for running the operating system 2112 and the application programs 2114. Example data may include web pages, text, images, sound files, video data, or other data sets to be sent to and/or received from one or more network servers or other devices via one or more wired or wireless networks.
Memory 2120 may also be arranged as or include one or more computer-readable storage media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. For example, computer-readable storage media includes, but is not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM (compact disc ROM), DVD (digital versatile disc), HD-DVD (high definition DVD), BLU-RAY, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by device 110.
The memory 2120 can be used to store a subscriber identifier, such as an International Mobile Subscriber Identity (IMSI), and an equipment identifier, such as an International Mobile Equipment Identifier (IMEI). Such identifiers may be transmitted to a network server to identify the user and the device. Device 110 may support one or more input devices 2130, such as a touchscreen 2132; a microphone 2134 for implementation of voice input for voice recognition, voice commands, and the like; a camera 2136; a physical keyboard 2138; a trackball 2140; and/or proximity sensors 2142; and one or more output devices 2150, such as speakers 2152 and one or more displays 2154. In some cases, other input devices (not shown) using gesture recognition may also be employed. Other possible output devices (not shown) may include piezoelectric or haptic output devices. Some devices may serve more than one input/output function. For example, the touchscreen 2132 and the display 2154 can be combined within a single input/output device.
The wireless modem 2160 may be coupled to an antenna (not shown) and may support bidirectional communication between the processor 2110 and external devices, as is well understood in the art. The modem 2160 is shown generically and may include a cellular modem and/or other radio-based modem (e.g., bluetooth 2164 or Wi-Fi 2162) for communicating with the mobile communications network 2104. The wireless modem 2160 is typically configured to communicate with one or more cellular networks, such as a GSM network, for data and voice communications within a single cellular network, between multiple cellular networks, or between a device and the Public Switched Telephone Network (PSTN).
The device may further include at least one input/output port 2180, a power supply 2182, a satellite navigation system receiver 2184 (such as a GPS receiver), an accelerometer 2186, a gyroscope (not shown), and/or a physical connector 2190, which may be a USB port, an IEEE1394 (firewire) port, and/or an RS-232 port. The illustrated components 2102 are not required or all-inclusive, as any components can be deleted and other components can be added.
FIG. 22 is an illustrative functional block diagram of a multimedia console 1104. The multimedia console 1104 includes a Central Processing Unit (CPU) 2201 having a level 1 cache 2202, a level 2 cache 2204, and a flash ROM (Read Only Memory) 2206. The level 1 cache 2202 and the level 2 cache 2204 temporarily store data and thus reduce the number of memory access cycles, thereby improving processing speed and throughput. The CPU 2201 may be configured with more than one core and thus with additional level 1 caches 2202 and level 2 caches 2204. The flash ROM 2206 may store executable code that is loaded during an initial phase of a boot process when the multimedia console 1104 is powered on.
A Graphics Processing Unit (GPU) 2208 and a video encoder/video codec (coder/decoder) 2214 form a video processing pipeline for high-speed and high-resolution graphics processing. Data is carried from the GPU 2208 to the video encoder/video codec 2214 via a bus. The video processing pipeline outputs data to an A/V (audio/video) port 2240 for transmission to a television or other display. A memory controller 2210 is connected to the GPU 2208 to facilitate processor access to various types of memory 2212, such as, but not limited to, RAM.
The multimedia console 1104 includes an I/O controller 2220, a system management controller 2222, an audio processing unit 2223, a network interface controller 2224, a first USB (Universal Serial Bus) host controller 2226, a second USB controller 2228, and a front panel I/O subassembly 2230 that are preferably implemented on a module 2218. The USB controllers 2226 and 2228 serve as hosts for peripheral controllers 2242(1) and 2242(2), a wireless adapter 2248, and an external memory device 2246 (e.g., flash memory, external CD/DVD ROM drive, removable media, etc.). The network interface controller 2224 and/or wireless adapter 2248 provide access to a network (e.g., the Internet, home network, etc.) and may be any of a wide variety of wired or wireless adapter components, including an Ethernet card, a modem, a Bluetooth module, a cable modem, and the like.
System memory 2243 is provided to store application data that is loaded during the boot process. A media drive 2244 is provided and may comprise a DVD/CD drive, hard drive, or other removable media drive, among others. The media drive 2244 may be internal or external to the multimedia console 1104. Application data may be accessed via the media drive 2244 for execution, playback, and the like by the multimedia console 1104. The media drive 2244 is connected to the I/O controller 2220 via a bus, such as a Serial ATA bus or other high-speed connection (e.g., IEEE 1394).
The system management controller 2222 provides a variety of service functions related to assuring availability of the multimedia console 1104. The audio processing unit 2223 and an audio codec 2232 form a corresponding audio processing pipeline with high fidelity and stereo processing. Audio data is carried between the audio processing unit 2223 and the audio codec 2232 via a communication link. The audio processing pipeline outputs data to the A/V port 2240 for reproduction by an external audio player or device having audio capabilities.
The front panel I/O subassembly 2230 supports the functionality of the power button 2250 and the eject button 2252, as well as any LEDs (light emitting diodes) or other indicators exposed on the outer surface of the multimedia console 1104. A system power supply module 2239 provides power to the components of the multimedia console 1104. A fan 2238 cools the circuitry within the multimedia console 1104.
The CPU 2201, GPU 2208, memory controller 2210, and various other components within the multimedia console 1104 are interconnected via one or more buses, including serial and parallel buses, a memory bus, a peripheral bus, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures can include a Peripheral Component Interconnect (PCI) bus, a PCI-Express bus, and the like.
When the multimedia console 1104 is powered on, application data may be loaded from the system memory 2243 into memory 2212 and/or caches 2202 and 2204 and executed on the CPU 2201. The application may present a graphical user interface that provides a consistent user experience when navigating to the different media types available on the multimedia console 1104. In operation, applications and/or other media contained within the media drive 2244 may be launched or played from the media drive 2244 to provide additional functionality to the multimedia console 1104.
The multimedia console 1104 may be operated as a standalone system by simply connecting the system to a television or other display. In this standalone mode, the multimedia console 1104 allows one or more users to interact with the system, watch movies, or listen to music. However, with the integration of broadband connectivity made available through the network interface controller 2224, the multimedia console 1104 may further be operated as a participant in a larger network community.
When the multimedia console 1104 is powered on, a set amount of hardware resources may be reserved for system use by the multimedia console operating system. These resources may include a reservation of memory (such as 16 MB), CPU and GPU cycles (such as 5%), network bandwidth (such as 8 kbps), and so forth. Because these resources are reserved at system boot time, the reserved resources do not exist from the application's point of view.
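The reservation scheme described above (fixed shares carved out at boot so they are invisible to applications) can be sketched as follows. The `Reservation` class and the total capacities are purely illustrative assumptions; only the 16 MB and 8 kbps reservation figures echo examples in the text.

```python
# Illustrative sketch of boot-time resource reservation. Reserved
# capacity is subtracted before applications ever see the totals, so
# from the application's point of view it "does not exist".

class Reservation:
    def __init__(self, total, reserved):
        if reserved > total:
            raise ValueError("cannot reserve more than the total")
        self.total = total
        self.reserved = reserved

    @property
    def available_to_applications(self):
        # Applications are only ever shown the unreserved remainder.
        return self.total - self.reserved


memory = Reservation(total=512, reserved=16)     # MB (total is assumed)
bandwidth = Reservation(total=1000, reserved=8)  # kbps (total is assumed)

print(memory.available_to_applications)     # 496
print(bandwidth.available_to_applications)  # 992
```

Performing the subtraction once at boot, rather than policing usage at runtime, is what gives the system applications a guaranteed share regardless of application load.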
In particular, the memory reservation is preferably large enough to contain the launch kernel, concurrent system applications, and drivers. The CPU reservation is preferably constant, such that if the reserved CPU usage is not consumed by the system applications, an idle thread will consume any unused cycles.
With regard to the GPU reservation, lightweight messages generated by the system applications (e.g., pop-ups) are displayed by using a GPU interrupt to schedule code to render a pop-up into an overlay. The amount of memory required for an overlay depends on the overlay area size, and the overlay preferably scales with screen resolution. Where a full user interface is used by the concurrent system application, it is preferable to use a resolution independent of the application resolution. A scaler may be used to set this resolution such that the need to change frequency and cause a TV resynchronization is eliminated.
After the multimedia console 1104 boots and system resources are reserved, concurrent system applications execute to provide system functionalities. The system functionalities are encapsulated in a set of system applications that execute within the reserved system resources. The operating system kernel identifies threads as system application threads rather than gaming application threads. The system applications are preferably scheduled to run on the CPU 2201 at predetermined times and intervals in order to provide a consistent view of the system resources to the application. The scheduling is to minimize cache disruption for the gaming application running on the console.
When the concurrent system application requires audio, audio processing is asynchronously scheduled to the gaming application due to time sensitivity. A multimedia console application manager (described below) controls the audio level (e.g., mute, attenuate) of the gaming application when system applications are active.
Input devices (e.g., controllers 2242(1) and 2242(2)) are shared by the gaming application and the system applications. The input devices are not reserved resources but are to be switched between the system applications and the gaming application such that each will have a focus of the device. The application manager preferably controls the switching of the input stream without the gaming application's knowledge, and a driver maintains state information regarding focus switches.
Various exemplary embodiments of the digital assistant extensibility to third party applications of the present invention are now presented by way of illustration rather than as an exhaustive list of all embodiments. One example includes a method for enabling extensibility of a digital assistant operating on a device to one or more applications, comprising: configuring an interface for interoperation with application-specific services exposed by an extension associated with a respective one of the applications; receiving an input from a device user; mapping the device user input to the extension for processing; and receiving the application-specific service from the extension in response to the device user input.
In another example, the method further includes rendering the application-specific service such that the cross-application user experience is exposed to the device user as a native digital assistant user experience and whereby the application-specific service increases the size of the answer database available to the digital assistant. In another example, the method further comprises using context data when performing the mapping. In another example, the context data includes one or more of: time/date, location of the user or device, language, schedule, applications installed on the device, user preferences, user behavior, user activity, stored contacts, call history, messaging history, browsing history, device type, device capabilities, or communication network type. In another example, the method further includes providing extensibility services to the application, the extensibility services including one or more of a language service, a vocabulary service, a user preference service, or a context service. In another example, the method further includes receiving portions of the extensibility services from a remote service provider. In another example, the method further includes supporting interfacing with an extensibility client configured for interacting with a remote service provider. In another example, the method further includes loading an application-specific resource from a manifest included in the application extension, the application-specific resource including at least a keyword registered with the digital assistant. In another example, the application extension also includes logic for implementing a user experience or user interface using the digital assistant.
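The manifest-loading example above can be sketched as follows. The manifest field names and the `load_manifest` helper are illustrative assumptions, not a format defined in this disclosure; the point is that application-specific resources, including keywords, are read from the manifest and registered with the digital assistant at load time.

```python
import json

# Hypothetical manifest shipped inside an application extension.
MANIFEST_JSON = """
{
  "app": "MovieDatabase",
  "keywords": ["movie", "showtimes"],
  "commands": {"find_showtimes": "MovieDatabase.Showtimes"}
}
"""


def load_manifest(raw: str, keyword_registry: dict) -> dict:
    """Load application-specific resources from a manifest into the runtime
    environment and register each keyword with the digital assistant."""
    manifest = json.loads(raw)
    for kw in manifest["keywords"]:
        # keyword -> owning application, so user inputs containing the
        # keyword can later be mapped to this extension
        keyword_registry[kw] = manifest["app"]
    return manifest
```

Once loaded, the registered keywords let the assistant route matching user inputs to the owning extension, and the listed commands identify the application operations the extension can invoke.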
In another example, the method further comprises configuring the digital assistant in response to a voice input, a gesture input, or a manual input for performing at least one of: sharing contact information, sharing stored contacts, scheduling a meeting, viewing a user's calendar, scheduling a reminder, making a phone call, operating a device, playing a game, making a purchase, taking notes, scheduling an alarm or wake up reminder, sending a message, checking social media for updates, crawling a website, interacting with a search service, sharing or displaying a file, sending a link to a website, or sending a link to a resource.
Yet another example includes an apparatus comprising: one or more processors; a User Interface (UI) for interacting with a user of the device using graphics and audio; and a memory device storing code associated with one or more applications and computer-readable instructions that, when executed by one or more processors, perform a method comprising: exposing a digital assistant on a device for maintaining context awareness for the device by monitoring user behavior and interactions with the device, the digital assistant also interacting with the device user through a UI using voice interactions, receiving input from the device user through the UI, delivering the input to an extension of an application for processing using the context awareness, the application extension configured to deliver a service from the application into a user experience that can be rendered by the digital assistant, and operating the digital assistant to render the service to the device user through the UI.
In another example, the method further includes exposing one or more extensibility services to the application extension. In another example, the apparatus further comprises enabling the application extension to load application-specific resources from a manifest into the runtime environment for execution. In another example, the application extension includes an event handler. In another example, the application extension includes logic that includes one of a script or a programming construct. In another example, the apparatus further includes exposing, using the application extension, one or more databases associated with the application to the digital assistant.
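The per-extension event handler mentioned above can be sketched as follows; the `AppExtension` class and its handler-registration interface are illustrative assumptions. Each extension instantiates its own handler that processes events associated with user inputs, actions, or behaviors of the corresponding application.

```python
from typing import Any, Callable, Dict


class AppExtension:
    """Sketch of an application extension that carries its own event handler."""

    def __init__(self, app_name: str):
        self.app_name = app_name
        self._handlers: Dict[str, Callable[[Any], str]] = {}

    def on(self, event_type: str, handler: Callable[[Any], str]) -> None:
        # Register a handler for one class of event (e.g., a user input,
        # an action, or a behavior of the corresponding application).
        self._handlers[event_type] = handler

    def handle_event(self, event_type: str, payload: Any) -> str:
        # The digital assistant passes the event to this extension's
        # handler; unrecognized event types are reported as unhandled.
        handler = self._handlers.get(event_type)
        return handler(payload) if handler else f"{self.app_name}: unhandled"
```

In this sketch the assistant associates an event with a series of user inputs and passes it to the handler of the extension whose application was identified, mirroring the claimed flow.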
Yet another example includes one or more computer-readable memory devices storing instructions that, when executed by one or more processors located in a computer server, perform a method comprising: interoperating with a digital assistant extensibility client on a local device, the digital assistant extensibility client exposing Application Programming Interfaces (APIs) to one or more application extensions executable on the device, each of the application extensions configured to deliver services from a respective application into a user experience renderable by the digital assistant; maintaining digital assistant extensibility services, comprising at least one of: i) a language service that enables the application to use one or more different languages when rendering the user experience on the local device, ii) a vocabulary service that enables the application to process unknown words or phrases when rendering the user experience, iii) a user preference service that enables the application to employ user preferences maintained by the digital assistant, or iv) a context service that enables the application to utilize context awareness when delivering the service; and providing the digital assistant extensibility service to the one or more application extensions through an API exposed by the digital assistant extensibility client on the local device.
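Two of the four extensibility services enumerated above (the vocabulary service and the context service) can be sketched as follows. The class name, the backing dictionaries, and the method names are stand-ins chosen for illustration; a real provider would serve these over the API exposed by the extensibility client.

```python
class ExtensibilityServices:
    """Sketch of extensibility services offered to application extensions."""

    def __init__(self, vocabulary: dict, context: dict):
        self._vocab = vocabulary    # backs the vocabulary service
        self._context = context    # backs the context service

    def resolve_phrase(self, phrase: str) -> str:
        # Vocabulary service: lets the application process unknown
        # words or phrases when rendering the user experience.
        return self._vocab.get(phrase.lower(), "unknown")

    def context_value(self, key: str):
        # Context service: lets the application utilize the digital
        # assistant's context awareness when delivering its service.
        return self._context.get(key)


services = ExtensibilityServices(
    vocabulary={"brb": "be right back"},
    context={"location": "home"},
)
```

A language service and a user preference service would follow the same pattern, each exposed to the one or more application extensions through the client-side API.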
In another example, a digital assistant extensibility service and a digital assistant extensibility client provide a platform that supports a user experience that can be rendered on a local device as a native digital assistant experience across all applications. In another example, the application extension includes application-specific resources that are written to a manifest that is loaded into the runtime environment. In another example, the application extension is authored by a third party developer.
Based on the foregoing, it should be appreciated that technologies for digital assistant extensibility have been disclosed herein. Although the subject matter described herein has been described in language specific to computer structural features, methodological and transformative acts, specific computing machines, and computer-readable storage media, it is to be understood that the invention defined in the appended claims is not necessarily limited to the specific features, acts, or media described herein. Rather, the specific features, acts, and media are disclosed as example forms of implementing the claims.
The above-described subject matter is provided by way of illustration only and should not be construed as limiting. Various modifications and changes may be made to the subject matter described herein without departing from the true spirit and scope of the present invention, which is set forth in the appended claims, and without necessarily following the example embodiments and applications illustrated and described.