Visual animation display method and related equipment
1. A visual animation display method comprising:
responding to an audio selection instruction of an audio playing interface, and determining target audio information to be played currently;
determining target visual key information that matches the target audio information, wherein the target visual key information comprises keywords that are present in the target audio information and that are suitable for visual presentation;
acquiring a target visual animation matched with the target visual key information, wherein the target visual animation is used for presenting keywords in the target visual key information in a visual mode;
and in the process of playing the target audio information, displaying the target visual animation matched with the target visual key information on the audio playing interface.
2. The method of claim 1, further comprising:
acquiring song keywords in a sample song;
obtaining a keyword label corresponding to the song keyword;
acquiring a designed visual animation;
and obtaining the mapping relation between the keyword label and the corresponding visual animation.
3. The method of claim 2, wherein the target audio information comprises a target song, the target visual key information comprises a target song name keyword, and the target visual animation comprises a target song name visual animation; wherein the method further comprises:
obtaining a target song name of the target song;
matching the target song name with the song keywords, and determining the target song name keywords and keyword labels thereof from the song keywords;
determining a target song name visual animation corresponding to the keyword label of the target song name keyword from the visual animation according to the mapping relation;
wherein, in the process of playing the target audio information, displaying a target visual animation matched with the target visual key information on the audio playing interface comprises:
displaying the target song name visual animation on the audio playing interface from when the target song starts playing until the song name animation playing duration elapses.
4. The method of claim 2, wherein the target audio information comprises a target song, the target visual key information comprises a target song cover keyword, and the target visual animation comprises a target song cover visual animation; wherein the method further comprises:
displaying a target song cover of the target song on the audio playing interface;
performing optical character recognition on the cover of the target song to obtain a character recognition result of the cover of the target song;
matching the character recognition result of the target song cover with the song keywords, and determining the target song cover keywords and keyword labels thereof from the song keywords;
determining a target song cover visual animation corresponding to the keyword label of the target song cover keyword from the visual animation according to the mapping relation;
wherein, in the process of playing the target audio information, displaying a target visual animation matched with the target visual key information on the audio playing interface comprises:
displaying the target song cover visual animation on the audio playing interface from when the target song starts playing until the cover animation playing duration elapses.
5. The method of claim 2, wherein the target audio information comprises a target song list, the target visual key information comprises a target song list keyword, and the target visual animation comprises a target song list visual animation; wherein the method further comprises:
obtaining a target song list subject term of the target song list;
matching the target song list subject term with the song keywords, and determining the target song list keyword and a keyword label thereof from the song keywords;
determining a target song list visual animation corresponding to the keyword label of the target song list keyword from the visual animation according to the mapping relation;
wherein, in the process of playing the target audio information, displaying a target visual animation matched with the target visual key information on the audio playing interface comprises:
displaying the target song list visual animation on the audio playing interface from when the songs in the target song list start playing until the song list animation playing duration elapses.
6. The method of claim 2, wherein the target audio information comprises a target song, the target visual key information comprises a target lyrics keyword, and the target visual animation comprises a target lyrics visual animation; wherein the method further comprises:
acquiring target lyrics of the target song;
matching the target lyrics with the song keywords, and determining the target lyrics keywords and keyword labels thereof from the song keywords;
determining a target lyric visual animation corresponding to the keyword label of the target lyric keyword from the visual animation according to the mapping relation;
wherein, in the process of playing the target audio information, displaying a target visual animation matched with the target visual key information on the audio playing interface comprises:
displaying the target lyric visual animation on the audio playing interface from when the target lyric line in which the target lyric keyword is located is played until the lyric animation playing duration elapses.
7. The method of claim 6, wherein displaying the target lyric visual animation on the audio playing interface, from when the target lyric line where the target lyric keyword is located is played until the lyric animation playing duration elapses, comprises:
and if at least two different target lyric keywords exist in the target lyric line, displaying a target lyric visual animation corresponding to the target lyric keyword matched for the first time in the target lyric line on the audio playing interface in the process of playing the target lyric line.
8. The method of claim 6, wherein displaying the target lyric visual animation on the audio playing interface, from when the target lyric line where the target lyric keyword is located is played until the lyric animation playing duration elapses, comprises:
identifying a predetermined number of words in the target lyric line forward, starting from the target lyric keyword;
matching the predetermined number of words against a negative word list to obtain matched negative words;
and if the number of matched negative words is even, displaying the target lyric visual animation on the audio playing interface from when the target lyric line is played until the lyric animation playing duration elapses.
9. The method of claim 6, wherein obtaining target lyrics for the target song comprises:
before starting playing the target song, obtaining target lyrics of the target song; or
when the target song is played, obtaining the target lyrics displayed on the audio playing interface; or
when the target song is played, obtaining target song audio of the target song;
and carrying out voice recognition on the target song audio to obtain the target lyrics.
10. The method according to any one of claims 3 to 9, wherein the target visual key information comprises target song keywords; and wherein, in the process of playing the target audio information, displaying a target visual animation matched with the target visual key information on the audio playing interface comprises:
matching the target song with the song keywords, and determining the target song keywords from the song keywords;
and if the same target song keyword is repeatedly identified in the target song, displaying, on the audio playing interface, the target visual animation matched with that keyword only when the keyword is identified for the first time during playing of the target song.
11. The method of claim 1, wherein the target audio information comprises a target song; wherein the method further comprises:
processing the target song through an emotion classifier, and determining a target emotion classification result of the target song;
acquiring target object characteristic information triggering the audio selection instruction;
determining a target background image, a target material and a target animation effect according to the target emotion classification result and the target object characteristic information;
and generating the target visual animation according to the target background picture, the target material and the target animation effect.
12. The method of claim 11, further comprising:
acquiring a training data set, wherein the training data set comprises sample lyrics and emotion labels thereof;
extracting text emotion features of the sample lyrics through a machine learning model;
processing the text emotion features of the sample lyrics with a recurrent neural network model to obtain global text features of the text emotion features;
processing the global text features through a text convolutional neural network model to extract local text features of the text emotion features, and obtaining a predicted emotion classification result of the sample lyrics according to the local text features;
and training the recurrent neural network model and the text convolutional neural network model according to the predicted emotion classification result and the corresponding emotion label, to obtain the emotion classifier.
13. A visual animation display device comprising:
the target audio information determining unit is used for responding to an audio selection instruction of the audio playing interface and determining the target audio information to be played currently;
a target visual key information determination unit, configured to determine target visual key information that matches the target audio information, where the target visual key information includes a keyword that is present in the target audio information and is suitable for being visually presented;
a target visual animation obtaining unit, configured to obtain a target visual animation matched with the target visual key information, where the target visual animation is used to visually present a keyword in the target visual key information;
and the target visual animation display unit is used for displaying the target visual animation matched with the target visual key information on the audio playing interface in the process of playing the target audio information.
14. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 12.
15. An electronic device, comprising:
at least one processor;
a storage device configured to store at least one program that, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1 to 12.
Background
Music playing functions have become essential on various terminal devices, such as mobile phones, computers, and in-vehicle infotainment systems (car machines), and the lyrics function of a music player has become one of its essential core functions, so that many users have the habit of watching the lyrics while listening to songs.
However, in the related art, only a single static background picture is displayed while lyrics are shown; the background picture is fixed, the display function is limited, and interactivity with the user is monotonous.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure.
Disclosure of Invention
The embodiment of the disclosure provides a visual animation display method and device, a computer-readable storage medium and an electronic device, which can solve the technical problem of monotonous interactivity when audio resources are played in the related art.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
The embodiment of the disclosure provides a visual animation display method, which comprises the following steps: responding to an audio selection instruction of an audio playing interface, and determining target audio information to be played currently; determining target visual key information that matches the target audio information, wherein the target visual key information comprises keywords that are present in the target audio information and that are suitable for visual presentation; acquiring a target visual animation matched with the target visual key information, wherein the target visual animation is used for presenting keywords in the target visual key information in a visual mode; and in the process of playing the target audio information, displaying the target visual animation matched with the target visual key information on the audio playing interface.
The disclosed embodiments provide a visual animation display device, the device comprising: the target audio information determining unit is used for responding to an audio selection instruction of the audio playing interface and determining the target audio information to be played currently; a target visual key information determination unit, configured to determine target visual key information that matches the target audio information, where the target visual key information includes a keyword that is present in the target audio information and is suitable for being visually presented; a target visual animation obtaining unit, configured to obtain a target visual animation matched with the target visual key information, where the target visual animation is used to visually present a keyword in the target visual key information; and the target visual animation display unit is used for displaying the target visual animation matched with the target visual key information on the audio playing interface in the process of playing the target audio information.
The disclosed embodiments provide a computer-readable storage medium on which a computer program is stored, which when executed by a processor implements a visual animation display method as described in the above embodiments.
An embodiment of the present disclosure provides an electronic device, including: at least one processor; a storage device configured to store at least one program that, when executed by the at least one processor, causes the at least one processor to implement the visual animated display method as described in the above embodiments.
According to an aspect of the present disclosure, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the visual animated display method provided in the various alternative implementations of the above-described embodiments.
In the technical solutions provided in some embodiments of the present disclosure, on one hand, in response to an audio selection instruction for an audio playing interface, target audio information to be currently played can be determined, and the target audio information is associated with target visual key information, so that in the process of playing the target audio information, a target visual animation matched with the target visual key information can be displayed on the audio playing interface. By creating a visual Easter egg effect for the user, playability and interest are improved, the interactivity the user experiences while listening to audio resources is enhanced, the atmosphere of listening is strengthened, visual freshness during listening is increased, and aesthetic fatigue is reduced. On the other hand, as playback of the target audio information progresses, the target visual animation can appear at unpredictable times, adding surprise and arousing curiosity, thereby promoting further use and avoiding waste of resources.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty. In the drawings:
FIG. 1 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure.
FIG. 2 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure.
FIG. 3 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure.
FIG. 4 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure.
FIG. 5 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure.
Fig. 6 schematically shows an interface schematic of a normal player according to an embodiment of the present disclosure.
Fig. 7 schematically shows the rain-cloud animation effect displayed when playback reaches the target lyric keyword "raining" in fig. 6.
Fig. 8 schematically shows an interface diagram in which the rain-cloud animation effect shown in fig. 7 gradually fades after 20 seconds, returning to the original player shown in fig. 6.
Fig. 9 schematically shows an interface schematic of a normal player according to an embodiment of the present disclosure.
Fig. 10 schematically shows the cherry blossom animation effect displayed when playback reaches the target lyric keyword "cherry blossom" in fig. 9.
Fig. 11 schematically shows an interface diagram in which the cherry blossom animation effect shown in fig. 10 gradually fades after 10 seconds, returning to the original player shown in fig. 9.
Fig. 12 schematically shows a schematic diagram of applying the method provided by the embodiment of the present disclosure to a lyric full screen page.
Fig. 13 schematically shows a schematic diagram of applying the method provided by the embodiment of the present disclosure to a karaoke (song K) page.
FIG. 14 schematically shows a flow chart of a method of visual animated display according to an embodiment of the present disclosure.
Fig. 15 schematically illustrates an interface diagram of a target song list subject term of a target song list according to an embodiment of the present disclosure.
Fig. 16 schematically shows the cherry blossom animation effect displayed when a song in the target song list shown in fig. 15 is played.
FIG. 17 schematically shows a flow diagram of a method of visual animated display according to an embodiment of the present disclosure.
FIG. 18 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure.
Fig. 19 schematically illustrates an Easter egg effect generation template according to an embodiment of the present disclosure.
FIG. 20 schematically illustrates a schematic diagram of displaying corresponding Easter egg effects through emotion recognition of lyrics according to an embodiment of the present disclosure.
FIG. 21 schematically illustrates a block diagram of a visual animated display device according to an embodiment of the present disclosure.
FIG. 22 shows a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals denote the same or similar parts in the drawings, and thus, a repetitive description thereof will be omitted.
The described features, structures, or characteristics of the disclosure may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and the like. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the disclosure.
The drawings are merely schematic illustrations of the present disclosure, in which the same reference numerals denote the same or similar parts, and thus, a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in at least one hardware module or integrated circuit, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and steps, nor do they necessarily have to be performed in the order described. For example, some steps may be decomposed, and some steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
In this specification, the terms "a", "an", "the", "said" and "at least one" are used to indicate the presence of at least one element/component/etc.; the terms "comprising," "including," and "having" are intended to be inclusive and mean that there may be additional elements/components/etc. other than the listed elements/components/etc.; the terms "first," "second," and "third," etc. are used merely as labels, and are not limiting on the number of their objects.
The following detailed description of exemplary embodiments of the disclosure refers to the accompanying drawings.
Based on the technical problems in the related art, the embodiments of the present disclosure provide a visual animation display method for at least partially solving the above problems. The method provided by the embodiments of the present disclosure may be executed by any electronic device, for example, a server, or a terminal device, or an interaction between a server and a terminal device, which is not limited in the present disclosure.
The server mentioned in the embodiment of the present disclosure may be an independent server, a server cluster or distributed system formed by a plurality of servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a CDN (Content Delivery Network), and big data and artificial intelligence platforms.
In the embodiment of the present disclosure, the terminal device may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a wearable smart device, a car machine, a smart television, or the like, but is not limited thereto. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, and the disclosure is not limited thereto.
FIG. 1 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure. The embodiment of the present disclosure is illustrated by taking a terminal device as an example, but the present disclosure is not limited thereto.
The visual animation mentioned in the embodiments of the present disclosure refers to an animation effect triggered when a user performs a specified operation; for example, the animation effect can be developed and embedded in an APP (application program) so as to achieve an Easter egg effect.
The Easter eggs mentioned in the embodiments of the present disclosure generally refer to surprises hidden in a product by its producer to delight users, and may include, for example, text, pictures, videos, sounds, or other small changes. They usually appear when the user performs some specific operation, such as entering a hidden command or performing an undocumented mouse or keyboard action not described in the user manual. The producer's purpose is usually to show the user extra material or simply to amuse the user. Easter eggs may be applied on a variety of system platforms, including but not limited to Android, iOS, Windows, and Linux.
As shown in fig. 1, the method provided by the embodiment of the present disclosure may include the following steps.
In step S110, in response to an audio selection instruction for the audio playing interface, target audio information to be currently played is determined.
For example, a user installs various clients on a terminal device, such as a music player, an audio client, a video client, an instant messaging client, an education client, or a travel client. When the user opens an audio playing interface (any interface capable of playing audio) of any installed client, an audio information list can be displayed on that interface. The user can select any one or more items in the audio information list to trigger an audio selection instruction, and the target audio information the user currently wants to play can be determined according to that instruction.
The audio information and the target audio information in the embodiments of the present disclosure may be any information that is played in an audio manner, such as a lecture, a song, news, and the like. In the following embodiments, the audio information is taken as a song and the target audio information is taken as a target song for illustration, but the disclosure is not limited thereto.
In step S120, target visual key information matching the target audio information is determined, wherein the target visual key information comprises keywords present in the target audio information and adapted to be visually presented.
In the embodiment of the present disclosure, visual key information refers to information that can be presented in a visual form, such as keywords with strong visual imagery like "cherry blossom" or "rain", as exemplified below. Target visual key information matching the target audio information means that information contained in the target audio information, such as the title of the target song, the song cover, or the lyrics, matches the target visual key information. For example, if "cherry blossom" appears in the lyrics of a song, the associated target visual key information may be "cherry blossom", which can then be displayed as a cherry blossom image, animation, or the like.
In step S130, a target visual animation matching the target visual key information is obtained, and the target visual animation is used for visually presenting the keywords in the target visual key information.
In some embodiments, when the terminal device determines the target audio information currently to be played and the target visual key information matching it, the terminal device may send a request for the target audio information to the server. If a target visual animation matching the target visual key information is preset for the target audio information, the target visual animation can be cached locally on the terminal device while the target audio information is pulled from the server, and displayed at the corresponding position when the target audio information is subsequently played. This improves the fluency of the target visual animation and keeps the playing of the target audio information better synchronized with the display of the target visual animation.
In other embodiments, when the target audio information currently to be played is determined at the terminal device, the terminal device may send a request for the target audio information to the server, and the server returns the corresponding target audio information. The terminal device then determines the target visual key information matching the target audio information and, while the target audio information is playing, pulls the corresponding target visual animation from the server according to the target visual key information and displays it at the corresponding position. This reduces the storage space occupied on the terminal device side.
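To make the first of these two flows concrete, the following is a minimal sketch of the prefetch variant, given for illustration only; the names (fetch_audio, fetch_animation, LocalCache) are hypothetical stand-ins rather than interfaces defined by this disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class LocalCache:
    """Terminal-side cache holding pulled audio and animations."""
    audio: dict = field(default_factory=dict)
    animations: dict = field(default_factory=dict)

def prefetch(song_id, key_info, server, cache):
    """Pull the target audio and its preset matching visual animation
    together, so the animation is already local when playback starts."""
    cache.audio[song_id] = server.fetch_audio(song_id)   # hypothetical server call
    animation = server.fetch_animation(key_info)         # may be None if no preset match
    if animation is not None:
        cache.animations[key_info] = animation
```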
In step S140, in the process of playing the target audio information, a target visual animation matched with the target visual key information is displayed on the audio playing interface.
In the embodiment of the disclosure, a target visual animation matched with the target visual key information can be obtained according to the target visual key information, and the target visual animation can be simultaneously displayed on the audio playing interface in the process of playing the target audio information, such as a target song.
For example, the Easter egg animation effect can be realized by displaying a plurality of pictures matching the target visual key information in rapid succession at short intervals (usually on the order of milliseconds, such as 100 ms or 50 ms).
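As an illustration only, such a frame-playback loop might look like the sketch below, where `frames` and `show` stand in for whatever image assets and drawing routine the player actually uses.

```python
import time

def play_easter_egg(frames, interval_ms=100, show=print):
    """Display pre-rendered pictures in rapid succession (e.g. every
    100 ms or 50 ms) to produce the Easter egg animation effect."""
    for frame in frames:
        show(frame)                     # draw one picture on the interface
        time.sleep(interval_ms / 1000)  # short, fixed inter-frame interval
```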
In an exemplary embodiment, the method may further include: acquiring song keywords in a sample song; obtaining a keyword label corresponding to the song keyword; acquiring a designed visual animation; and obtaining the mapping relation between the keyword label and the corresponding visual animation.
Taking songs as an example, a large number of songs can be collected as sample songs, keywords that can be displayed in a visual form are extracted from the sample songs in advance as song keywords, and corresponding keyword labels are assigned to the song keywords. A keyword label is an identifier that uniquely distinguishes each song keyword from the others and is convenient for computer identification.
In some embodiments, the target visual animation may be pre-authored by a developer (e.g., an animator) based on the native code of the respective system platform (e.g., the Android platform or the iOS platform).
Then, a mapping relationship is established between the keyword label corresponding to each song keyword and the corresponding visual animation. For example, assuming that a certain song keyword is "cherry blossom" with keyword label "1" (for illustration only), and a cherry-blossom-related visual animation has been designed, the mapping relationship between the keyword label "1" and that visual animation can be stored in the database, so that the matching target visual animation can later be found according to the target visual key information.
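A minimal sketch of this mapping is shown below, using in-memory dictionaries in place of the database mentioned in the text; the tag values and asset names are illustrative assumptions.

```python
KEYWORD_TO_TAG = {"cherry blossom": "1", "rain": "2"}  # song keyword -> keyword label
TAG_TO_ANIMATION = {                                   # keyword label -> visual animation asset
    "1": "sakura_petals.anim",
    "2": "rain_clouds.anim",
}

def lookup_animation(keyword):
    """Resolve a matched song keyword to its target visual animation."""
    tag = KEYWORD_TO_TAG.get(keyword)
    return TAG_TO_ANIMATION.get(tag) if tag is not None else None
```

Keeping keyword-to-label and label-to-animation separate lets several song keywords share a single visual animation.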
In an exemplary embodiment, the target audio information may include a target song, the target visual key information may include a target song name keyword, and the target visual animation may include a target song name visual animation.
Wherein the method may further comprise: obtaining a target song name of the target song; matching the target song name with the song keywords, and determining the target song name keywords and keyword labels thereof from the song keywords; and determining the target song name visual animation corresponding to the keyword label of the target song name keyword from the visual animation according to the mapping relation.
In the process of playing the target audio information, displaying the target visual animation matched with the target visual key information on the audio playing interface may include: displaying the target song name visual animation on the audio playing interface from when the target song starts playing until the song name animation playing duration elapses.
In an exemplary embodiment, the target audio information comprises a target song, the target visual key information comprises a target song cover keyword, and the target visual animation comprises a target song cover visual animation.
Wherein the method may further comprise: displaying a target song cover of the target song on the audio playing interface; performing optical character recognition on the cover of the target song to obtain a character recognition result of the cover of the target song; matching the character recognition result of the target song cover with the song keywords, and determining the target song cover keywords and keyword labels thereof from the song keywords; and determining the target song cover visual animation corresponding to the keyword label of the target song cover keyword from the visual animation according to the mapping relation.
In the process of playing the target audio information, displaying the target visual animation matched with the target visual key information on the audio playing interface may include: displaying the target song cover visual animation on the audio playing interface from when the target song starts playing until the cover animation playing duration elapses.
In an exemplary embodiment, the target audio information may include a target song list, the target visual key information may include a target song list keyword, and the target visual animation may include a target song list visual animation.
In an exemplary embodiment, the target audio information may include a target song, the target visual key information may include a target lyric keyword, and the target visual animation may include a target lyric visual animation.
Wherein the method may further comprise: acquiring target lyrics of the target song; matching the target lyrics with the song keywords, and determining the target lyrics keywords and keyword labels thereof from the song keywords; and determining the target lyric visual animation corresponding to the keyword label of the target lyric keyword from the visual animation according to the mapping relation.
In the process of playing the target audio information, displaying the target visual animation matched with the target visual key information on the audio playing interface may include: displaying the target lyric visual animation on the audio playing interface from when the target lyric line in which the target lyric keyword is located is played until the lyric animation playing duration elapses.
In the disclosed embodiment, assuming that a song has a song name, a song cover, and lyrics, the corresponding target song includes a target song name, a target song cover, and target lyrics.
When identifying the target visual key information of the target song, identification priorities may be set. The following embodiments assume that the identification priority of the target song name is higher than that of the target song cover, and the identification priority of the target song cover is higher than that of the target lyrics.
First, the target song name is searched for a target song name keyword matching the pre-extracted song keywords; if one is found, the corresponding target song name visual animation is played with priority.
Regardless of whether a target song name keyword is found, the target song cover can then be searched for a target song cover keyword matching the song keywords; if one is found, the corresponding target song cover visual animation can be played with priority (if no target song name visual animation exists) or after the target song name visual animation has been shown.
Regardless of whether a target song cover keyword is found, the target lyrics can then be searched for a target lyric keyword matching the song keywords. If one is found, the corresponding target lyric visual animation can be played after the target song name visual animation and/or target song cover visual animation have been shown (if they exist), or directly when the target lyric line containing the target lyric keyword is played (if they do not exist). That is, one or more (two or more) target visual animations may be presented at intervals during the playing of a target song.
In some embodiments, some target songs may not have a song cover, in which case the steps associated with the target song cover are simply skipped.
It is to be understood that the above identification priorities are only for illustration, and the present disclosure is not limited thereto. For example, the identification priority of the target lyrics may be set higher than that of the target song cover, and that of the target song cover higher than that of the target song name; alternatively, the identification priority of the target song cover may be higher than that of the target lyrics, and that of the target lyrics higher than that of the target song name, and so on.
In other embodiments, only one or two of the following may be identified as target visual key information: a target lyric keyword in the target lyrics of the target song, a target song cover keyword in the target song cover, or a target song name keyword in the target song name.
On one hand, the visual animation display method provided by the embodiment of the present disclosure can determine the target audio information to be currently played in response to an audio selection instruction for an audio playing interface, and the target audio information is associated with target visual key information, so that in the process of playing the target audio information, the target visual animation matched with the target visual key information can be displayed on the audio playing interface. By creating a visual Easter egg effect for the user, playability and interest are improved, the interactivity the user experiences while listening to audio resources is enhanced, the atmosphere of listening is strengthened, visual freshness during listening is increased, and aesthetic fatigue is reduced. On the other hand, as playback of the target audio information progresses, the target visual animation can appear at unpredictable times, adding surprise and arousing curiosity, thereby promoting further use and avoiding waste of resources.
In the embodiments of fig. 2-4, the identification priority of the target song name is set higher than that of the target song cover, and the identification priority of the target song cover is set higher than that of the target lyrics.
FIG. 2 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure. In the fig. 2 embodiment, the target audio information may include a target song, the target visual key information may include a target song name keyword, and the target visual animation may include a target song name visual animation.
As shown in fig. 2, the method provided by the embodiment of the present disclosure may include the following steps.
In step S201, a song keyword in the sample song is acquired.
In step S202, a keyword tag corresponding to the song keyword is obtained.
In the embodiment of the disclosure, some keywords with strong visual imagery can be selected from the song names, song covers, lyrics, and the like of the sample songs as song keywords; the song keywords are then classified and corresponding keyword labels are set, so as to establish a keyword label library.
For example, the keyword tag library may include the following categories of song keywords:
1. Weather class: sunny, rainy, windy, cloudy, snowy, storm, thunder and lightning, gentle breeze, rainbow, starry sky, etc.
2. Mood class: love, heartbreak, happiness, melancholy, sadness, heartache, despair, etc. (requiring a check for a preceding negative word)
3. Nature class: cherry blossom, petal, red flower, grassland, mountain, river, sea, etc.
Further, a negative word list may be set, containing, for example, negative words such as "no", "not", "cannot", and "don't", for further judging whether the matched song keyword expresses a negative or an affirmative meaning, as sketched below.
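Purely for illustration, the keyword label library and negative word list described above might be represented as follows; the category names and entries mirror the examples in the text.

```python
KEYWORD_TAG_LIBRARY = {
    "weather": ["sunny", "rainy", "windy", "cloudy", "snowy", "storm",
                "thunder and lightning", "gentle breeze", "rainbow", "starry sky"],
    "mood": ["love", "heartbreak", "happiness", "melancholy",
             "sadness", "heartache", "despair"],   # check for a preceding negative word
    "nature": ["cherry blossom", "petal", "red flower", "grassland",
               "mountain", "river", "sea"],
}

NEGATIVE_WORDS = {"no", "not", "cannot", "don't"}  # used for the polarity check
```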
In step S203, a visual animation of the design is acquired.
In step S204, a mapping relationship between the keyword tag and the corresponding visual animation is obtained.
In the embodiment of the disclosure, a visual animation of the corresponding style is designed for each category, and a mapping relationship is constructed between the keyword label of each song keyword and the corresponding visual animation, so that when target visual key information is identified in the target song, the target visual animation can be determined accordingly.
In step S205, in response to the audio selection instruction to the audio playing interface, a target song to be played currently is determined. This step can be referred to the fig. 1 embodiment described above.
In step S206, it is determined whether the screen is in a locked state; if not, step S207 is executed; if the screen is locked, the process goes to step S211.
The locked-screen state in the embodiment of the present disclosure refers to a state in which the audio playing interface is not displayed on the screen of the terminal device. For example, a user may open a music player and select a target song, but while listening open a browser to browse a webpage, open other software to work, or stop looking at the screen while chatting or during a lunch break. The terminal device then automatically enters the locked-screen state, and the target visual animation does not need to be displayed.
In step S207, a target song name of the target song is obtained.
In the embodiment of the disclosure, a target song name of a target song is obtained first, and the target song name is identified and matched with a song keyword.
In step S208, the target song title and the song keywords are matched, and the target song title keyword and the keyword tag thereof are determined from the song keywords.
For example, if the target song name of the target song is "all rainy days", the weather-class song keyword "rainy" in the keyword label library matches the target song name; "rainy" is then used as the target song name keyword, and its keyword label can be obtained accordingly.
In step S209, a target song title visual animation corresponding to the keyword tag of the target song title keyword is determined from the visual animation according to the mapping relationship.
For example, based on the keyword label of the target song name keyword "rainy", the designed rain-related visual animation can be determined from the stored mapping relationship as the target song name visual animation.
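A hedged sketch of steps S208-S209 follows, reusing the illustrative dictionaries from the mapping sketch earlier; simple substring matching stands in for whatever matching rule an implementation actually adopts.

```python
def match_title_animation(title, keyword_to_tag, tag_to_animation):
    """Match the target song name against the song keywords, then resolve
    the matched keyword's label to its target song name visual animation."""
    for keyword, tag in keyword_to_tag.items():
        if keyword in title:                       # e.g. "rain" matches "all rainy days"
            return keyword, tag_to_animation.get(tag)
    return None, None

# match_title_animation("all rainy days", {"rain": "2"}, {"2": "rain_clouds.anim"})
# -> ("rain", "rain_clouds.anim")
```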
In step S210, the target song name visual animation is displayed on the audio playing interface from when the target song starts playing until the song name animation playing duration elapses.
For example, when the target song name "all rainy days" includes the target song name keyword "rainy", the rain-related target song name visual animation may be displayed on the audio playing interface, starting when the target song begins to play, for a preset song name animation playing duration, for example 20 seconds (which may be set according to practical situations; the disclosure is not limited thereto).
The same or different song name animation playing durations can be set for different target songs; for example, different durations can be set according to each target song's musical rhythm, lyric style, target song name keyword, and the like.
It is to be understood that, although the example here starts displaying the target song name visual animation when the target song starts playing, the disclosure is not limited thereto; the target song name visual animation may also be displayed on the audio playing interface, for example, near the end of the target song or in an intermediate stage of playing the target song.
In step S211, the target visual animation is not displayed.
According to the visual animation display method provided by the embodiment of the disclosure, target song names can contain different text information, and different target song names bring users different psychological feelings. A series of visual animations matching the mood of the song keywords is configured in advance, and corresponding keyword labels are assigned to specific words in the target song name. When the target song is played, the target song name visual animation corresponding to the keyword label can be displayed in an audio playing interface such as the music player, the lyrics page, or the music APP home page, producing a visual Easter egg effect matched with the target song name keyword and increasing the atmosphere and surprise of listening to music.
FIG. 3 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure. The target audio information in the embodiment of fig. 3 may include a target song, the target visual key information may include a target song cover keyword, and the target visual animation may include a target song cover visual animation.
As shown in fig. 3, the method provided by the embodiment of the present disclosure may include the following steps.
In step S301, a target song cover of the target song is displayed on the audio playing interface.
In the embodiment of the disclosure, after the target song name of the target song has been identified and matched against the song keywords, the target song cover can be obtained and likewise identified and matched against the song keywords, regardless of whether a target song name keyword was matched.
Related information such as the theme, singer, and version of the target song can be included on the target song cover.
In step S302, optical character recognition is performed on the cover of the target song to obtain a text recognition result of the cover of the target song.
In the embodiment of the present disclosure, OCR (Optical Character Recognition, the process of converting typeset text, drawings, or scene text in an electronic image into machine-coded text) may be used to recognize the characters on the target song cover, which is in an image format, so as to obtain the character recognition result.
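As one possible concrete realization of this step (an assumption, since the disclosure does not mandate any particular OCR engine), the open-source pytesseract binding for the Tesseract engine could be used:

```python
from PIL import Image
import pytesseract

def recognize_cover_text(cover_path):
    """Run OCR over the target song cover image and return the raw
    character recognition result as text."""
    return pytesseract.image_to_string(Image.open(cover_path))
```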
In step S303, the text recognition result of the target song cover is matched with the song keywords, and the target song cover keywords and the keyword tags thereof are determined from the song keywords.
In step S304, a target song cover visual animation corresponding to the keyword tag of the target song cover keyword is determined from the visual animations according to the mapping relationship.
In step S305, the target song cover visual animation is displayed on the audio playing interface from when the target song starts playing until the cover animation playing duration elapses.
In the embodiment of the present disclosure, the cover animation playing time length may be set according to actual situations, for example, 10 seconds, but the present disclosure is not limited thereto. The same or different cover animation playing time lengths can be set for different target songs, for example, different cover animation playing time lengths can be set according to the music rhythm, the lyric style, different target song cover keywords and the like of each target song.
It is to be understood that, although the target song cover visual animation is shown here when the target song starts playing, the present disclosure is not limited thereto; the target song cover visual animation may also be shown on the audio playing interface near the end of the target song or during an intermediate stage of playing the target song.
According to the visual animation display method provided by the embodiment of the disclosure, target song covers can contain different text information, and different target song covers bring users different psychological feelings. A series of visual animations matching the mood of the song keywords is configured in advance, and corresponding keyword labels are assigned to specific words on the target song cover. When the target song starts playing, the target song cover visual animation corresponding to the keyword label can be displayed in an audio playing interface such as the music player, the lyrics page, or the music APP home page, producing a visual Easter egg effect matched with the target song cover keyword and increasing the atmosphere and surprise of listening to music.
FIG. 4 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure. In the FIG. 4 embodiment, the target audio information may include a target song, the target visual key information may include a target lyric keyword, and the target visual animation may include a target lyric visual animation.
As shown in fig. 4, the method provided by the embodiment of the present disclosure may include the following steps.
In step S401, target lyrics of the target song are acquired.
In the embodiment of the disclosure, after the target song name and the target song cover of the target song have been identified and matched against the song keywords, the target lyrics can be obtained and likewise identified and matched against the song keywords, regardless of whether a target song name keyword or target song cover keyword was matched.
In an exemplary embodiment, obtaining the target lyrics of the target song may include: before starting to play the target song, obtaining the target lyrics of the target song; or, when the target song is played, obtaining the target lyrics displayed on the audio playing interface; or, when the target song is played, obtaining the target song audio of the target song and performing speech recognition on the target song audio to obtain the target lyrics.
In the embodiment of the present disclosure, a plurality of technical means may be adopted to obtain the target lyrics, the target song name and the target song cover, such as one or more of a text recognition technology, a voice recognition technology, a picture recognition technology, and the like.
For example, character recognition technology may be used: before or during playback of the target song, the target lyrics are displayed on the screen, specific characters (those matching the song keywords) in the target song name and the target lyrics are recognized and given corresponding keyword labels, and when the target song starts playing in the music player, or when the target lyric line containing the target lyric keyword is played, the target visual animation corresponding to the keyword label is called up for display.
For another example, speech recognition technology may be used: while the target song is playing, song keywords contained in the lyrics (the target song audio) are recognized as target lyric keywords, and the target visual animation corresponding to the respective keyword label is called up for display.
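A minimal sketch of this speech recognition path is given below, assuming the third-party SpeechRecognition package; any ASR engine that produces a transcript of the target song audio would play the same role.

```python
import speech_recognition as sr

def lyrics_from_audio(wav_path):
    """Transcribe the target song audio so that song keywords can be
    matched against the recognized text."""
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    return recognizer.recognize_google(audio)  # network-based recognizer
```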
In step S402, the target lyrics are matched with the song keywords, and the target lyrics keywords and their keyword tags are determined from the song keywords.
In step S403, a target lyric visual animation corresponding to the keyword tag of the target lyric keyword is determined from the visual animation according to the mapping relationship.
In step S404, the target lyric visual animation is displayed on the audio playing interface from when the target lyric line where the target lyric keyword is located is played until the lyric animation playing duration elapses.
In the embodiment of the present disclosure, the lyric animation playing time length may be set according to actual conditions, for example, 10 seconds, but the present disclosure is not limited thereto. The same or different lyric animation playing time lengths can be set for different target songs, for example, different lyric animation playing time lengths can be set according to the music rhythm, the lyric style, different target lyric keywords and the like of each target song.
It is to be understood that, although the target lyric visual animation is shown here while the target lyric line is playing, the disclosure is not limited thereto; the target lyric visual animation may also be shown on the audio playing interface, for example, near the end of the target song, in an intermediate stage of playing the target song, or while playing the target lyric line together with the lyric lines immediately above and below it.
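Assuming the lyrics carry per-line timestamps (as in the common LRC format, an assumption rather than a requirement of the disclosure), the display window of step S404 can be sketched as:

```python
def animation_window(line_start_s, lyric_anim_duration_s=10.0):
    """Return the (start, end) playback times, in seconds, during which
    the target lyric visual animation is displayed."""
    return line_start_s, line_start_s + lyric_anim_duration_s

def should_show(now_s, window):
    """True while the current playback time falls inside the window."""
    start, end = window
    return start <= now_s < end
```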
It should be noted that the target song name keyword, the target song cover keyword and the target lyric keyword in the embodiment of the disclosure may be matched with the same song keyword or different song keywords, and the corresponding target song name visual animation, target song cover visual animation and target lyric visual animation may also be the same visual animation or different visual animations, which is not limited in the disclosure. Similarly, the song title animation playing time length, the song cover animation playing time length, and the lyric animation playing time length may be set to be the same or different.
In an exemplary embodiment, displaying the target lyric visual animation on the audio playing interface, from when the target lyric line where the target lyric keyword is located is played until the lyric animation playing duration elapses, may include: if at least two different target lyric keywords exist in the target lyric line, displaying, during playback of the target lyric line, only the target lyric visual animation corresponding to the first-matched target lyric keyword in that line.
In the embodiment of the present disclosure, if a plurality of (two or more) different target lyric keywords are simultaneously recognized in the same target lyric line (the same lyric) of the target lyric, the target lyric keyword recognized first in the same target lyric line may be set as the criterion, and in the whole process of playing the target lyric line, only the target lyric visual animation corresponding to the target lyric keyword recognized first is displayed on the audio playing interface. Therefore, the display mode of the target visual animation when a plurality of song keywords are matched is determined, the resource waste caused by repeatedly switching different target visual animations in the process of playing a certain song lyric is avoided, the dazzling of a user caused by repeated switching is also avoided, and the user experience is further improved.
For example, if three target lyric keywords of "rain", "cherry blossom" and "ocean" are sequentially recognized in a certain lyric, only the target lyric visual animation related to rain is displayed on the audio playing interface during the process of playing the lyric.
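To make the selection rule concrete, the following is a minimal Python sketch of the "first match wins" behavior described above. The keyword list and tag names are hypothetical illustrations, not the disclosure's actual data.

```python
# Hypothetical keyword -> tag mapping for demonstration purposes only.
KEYWORD_TAGS = {"rain": "weather_rain", "cherry blossom": "flora_sakura", "ocean": "water_ocean"}

def pick_animation_tag(lyric_line: str) -> str | None:
    """Return the tag of the first song keyword matched in a lyric line.

    Only this single tag is used for the whole line, so the player never
    switches animations mid-line even if more keywords appear later.
    """
    matches = []
    for keyword, tag in KEYWORD_TAGS.items():
        pos = lyric_line.find(keyword)
        if pos != -1:
            matches.append((pos, tag))
    if not matches:
        return None
    return min(matches)[1]  # the keyword at the smallest position wins

# "rain" appears before "cherry blossom" and "ocean", so only the
# rain animation tag is returned for this line.
assert pick_animation_tag("rain falls on the cherry blossom by the ocean") == "weather_rain"
```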
In the embodiment of the present disclosure, if at least two different target song name keywords exist in the target song name, only the target song name visual animation corresponding to the first-matched target song name keyword may be displayed on the audio playing interface when the target song starts playing.
In the embodiment of the present disclosure, if at least two different target song cover keywords exist in the target song cover, only the target song cover visual animation corresponding to the first-matched target song cover keyword may be displayed on the audio playing interface when the target song starts playing.
In the embodiment of the present disclosure, if different target song name keywords and target song cover keywords are identified at the same time, either only the target song name visual animation corresponding to the matched target song name keyword, or only the target song cover visual animation corresponding to the matched target song cover keyword, may be displayed on the audio playing interface when the target song starts playing.
In the embodiment of the disclosure, for the first lyric line, if a target lyric keyword is identified in it while different target song name keywords and target song cover keywords are also identified, then during the playing of that first line only one of the following may be displayed on the audio playing interface: the target song name visual animation corresponding to the matched target song name keyword, the target song cover visual animation corresponding to the matched target song cover keyword, or the target lyric visual animation corresponding to the target lyric keyword.
In an exemplary embodiment, displaying the target lyric visual animation on the audio playing interface from the start of the target lyric line containing the target lyric keyword for the lyric animation playing duration may include: identifying a predetermined number of words in the target lyric line preceding the target lyric keyword; matching the predetermined number of words against a negative word list to obtain the matched negative words; and, if the number of matched negative words is even, displaying the target lyric visual animation on the audio playing interface from the start of the target lyric line for the lyric animation playing duration.
In the embodiment of the disclosure, once a target lyric keyword is located in a target lyric line, the n words preceding the target lyric keyword (the predetermined number, n being a positive integer greater than or equal to 1) can be identified and each matched against a negative word list, and it is judged whether the number of matched negative words is odd or even. If the number is odd, the target lyric keyword can be judged to express a negated meaning, and the target lyric visual animation need not be displayed on the audio playing interface; if the number is even, the target lyric keyword can be judged to express an affirmative meaning, and the target lyric visual animation can be displayed on the audio playing interface.
In the embodiment of the disclosure, once a target song name keyword is located in a target song name, the n words preceding the target song name keyword can likewise be identified and matched against the negative word list. If the number of matched negative words is odd, the target song name keyword can be judged to express a negated meaning, and the target song name visual animation need not be displayed on the audio playing interface; if the number is even, the target song name keyword can be judged to express an affirmative meaning, and the target song name visual animation can be displayed on the audio playing interface.
In the embodiment of the disclosure, once a target song cover keyword is located in the character recognition result of a target song cover, the n words preceding the target song cover keyword can likewise be identified and matched against the negative word list. If the number of matched negative words is odd, the target song cover keyword can be judged to express a negated meaning, and the target song cover visual animation need not be displayed on the audio playing interface; if the number is even, the target song cover keyword can be judged to express an affirmative meaning, and the target song cover visual animation can be displayed on the audio playing interface.
By searching for matched negative words, the meaning expressed in the lyrics, the song name, and the song cover can be identified more accurately, a target visual animation that better fits the mood can be displayed, interactivity with the user is further improved, and waste of resources is avoided.
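A hedged sketch of the negation-parity check described above follows: it inspects the n words preceding a matched keyword and allows the animation only when the number of negative words among them is even. The negative-word list, the window size, and the reading that "preceding words" are examined are illustrative assumptions.

```python
# Toy negative-word list; a real system would use a curated lexicon.
NEGATIVE_WORDS = {"no", "not", "never", "without"}

def should_show_animation(words: list[str], keyword_index: int, n: int = 3) -> bool:
    """Return True when the keyword is judged to express an affirmative meaning."""
    window = words[max(0, keyword_index - n):keyword_index]  # the n preceding words
    negations = sum(1 for w in window if w.lower() in NEGATIVE_WORDS)
    return negations % 2 == 0  # an even count (including zero) means affirmative

line = "it is not not raining today".split()
print(should_show_animation(line, line.index("raining")))  # True: the double negation cancels out
```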
According to the visual animation display method provided by the embodiment of the disclosure, target lyrics can carry varied textual information, and different target lyrics can evoke different feelings in a user. A series of visual animations matching the mood of the song keywords is configured in advance, and the corresponding keyword tags are attached to specific words in the target lyrics. When the target song starts playing, the target lyric visual animation corresponding to a keyword tag can be displayed on audio playing interfaces such as the music player, the lyric page, or the music app home page, producing a visual Easter-egg effect matched to the target lyric keyword and enhancing the atmosphere and surprise of listening to music.
In an exemplary embodiment, the target visual key information comprises a target song keyword; displaying, during the playing of the target audio information, the target visual animation matched with the target visual key information on the audio playing interface may include: matching the target song against the song keywords and determining the target song keywords from among them; and, if the same target song keyword is repeatedly identified in the target song, displaying the target visual animation matched with that keyword on the audio playing interface only when it is identified for the first time during the playing of the target song.
In an embodiment of the present disclosure, the target song keyword may include at least one of the above-mentioned target song name keyword, target song cover keyword, and target lyric keyword.
In the embodiment of the disclosure, if the same target song keyword is identified in the target song name, the target song cover, and the lyrics of the target song, the method may be configured to show the corresponding target visual animation only once on the audio playing interface during the entire playback of the target song, namely when that keyword is first identified; in other words, each visual animation may be shown only once per song. This avoids repeatedly displaying the same target visual animation while one song plays, prevents waste of resources, and spares the user the fatigue of watching the same target visual animation again and again, as the sketch below illustrates.
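A small Python sketch of the "show each animation only once per song" rule: a set of already-triggered tags is kept for the current song and consulted before every display. The class and tag names are illustrative.

```python
class AnimationGate:
    """Tracks which animation tags have already been shown for the current song."""

    def __init__(self):
        self.shown: set[str] = set()

    def should_display(self, tag: str) -> bool:
        """True only the first time a tag is seen during the current song."""
        if tag in self.shown:
            return False
        self.shown.add(tag)
        return True

gate = AnimationGate()
print(gate.should_display("weather_rain"))  # True  (e.g., matched in the song name)
print(gate.should_display("weather_rain"))  # False (same keyword matched again in the lyrics)
```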
In the embodiment of the disclosure, dedicated designs may also be created for special scenarios, such as festival songs or a singer's birthday campaign.
FIG. 5 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure. The embodiment of fig. 5 only exemplifies the method provided by the above embodiment by taking the target lyrics of the target song as an example.
As shown in fig. 5, the method provided by the embodiment of the present disclosure may include the following steps.
In step S501, the system lists keywords suitable for Easter-egg designs, classifies them, and establishes a keyword tag library.
The keywords suitable for Easter-egg designs are selected keywords with strong visual connotations; these selected keywords serve as the song keywords.
In step S502, a corresponding visual animation is designed.
In step S503, a matching rule of the keyword tag and the visual animation is formulated.
The mapping relation between the keyword tags of the song keywords and the corresponding visual animations is obtained and stored; a sketch of such a mapping follows.
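The following is a minimal sketch of steps S501 to S503: a keyword tag library and a mapping from keyword tags to designed visual animations. All tag names and animation asset paths are hypothetical placeholders.

```python
# Step S501: keyword tag library (keyword -> tag), hypothetical entries.
KEYWORD_TAG_LIBRARY = {
    "rain": "weather_rain",
    "snow": "weather_snow",
    "cherry blossom": "flora_sakura",
}

# Steps S502-S503: mapping from keyword tags to designed visual animations.
TAG_TO_ANIMATION = {
    "weather_rain": "animations/cloud_rain.webm",
    "weather_snow": "animations/snowfall.webm",
    "flora_sakura": "animations/sakura_petals.webm",
}

def animation_for_keyword(keyword: str) -> str | None:
    """Resolve a song keyword to its visual animation via the stored tag mapping."""
    tag = KEYWORD_TAG_LIBRARY.get(keyword)
    return TAG_TO_ANIMATION.get(tag) if tag else None

print(animation_for_keyword("rain"))  # animations/cloud_rain.webm
```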
In step S504, the user plays music in the software.
The user opens software installed on the terminal device that can play music, such as a music player. The user may directly click a song in the song list displayed on the audio playing interface to select it as the target song, or may enter information such as a singer name or album name in the search input box on the audio playing interface, search for and display the corresponding song list, and click to select the target song.
In step S505, the system rapidly scans the lyrics text of the currently playing music, and if a target lyric keyword is extracted from it, a mark is placed on the playing timeline.
In the embodiment of fig. 5, assume that once the user selects the target song, its target lyrics are immediately displayed on the audio playing interface (referred to here as the music playing interface). Before playback starts, the system may quickly scan the target lyrics in text form to determine whether any target lyric keyword matches the preset song keywords.
If a target lyric keyword is identified, a mark can be placed at the corresponding position on the playing timeline of the target song.
In step S506, when the music is played to the target lyric keyword mark, the system calls the target visual animation matched with the keyword tag of the target lyric keyword and displays the target visual animation on the music playing interface.
That is, when playback reaches the target lyric keyword mark, the target visual animation matching the keyword tag of that target lyric keyword can be called up and displayed on the music playing interface.
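A sketch of steps S505 and S506 follows: timestamped lyric lines are scanned before playback, and marks are placed on the playing timeline so that when playback reaches a mark the matching animation can be called up. The LRC-style input format and the helper names are assumptions for illustration.

```python
import re

# Matches LRC-style lines such as "[00:12.50]walking in the rain".
LYRIC_LINE = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)\](.*)")

def mark_timeline(lrc_text: str, keyword_tags: dict[str, str]) -> list[tuple[float, str]]:
    """Return (seconds, tag) marks for lyric lines containing song keywords."""
    marks = []
    for raw in lrc_text.splitlines():
        m = LYRIC_LINE.match(raw.strip())
        if not m:
            continue
        seconds = int(m.group(1)) * 60 + float(m.group(2))
        text = m.group(3)
        for keyword, tag in keyword_tags.items():
            if keyword in text:
                marks.append((seconds, tag))
                break  # first match wins for this line
    return marks

lrc = "[00:12.50]walking in the rain\n[00:20.00]under the cherry blossom"
print(mark_timeline(lrc, {"rain": "weather_rain", "cherry blossom": "flora_sakura"}))
# [(12.5, 'weather_rain'), (20.0, 'flora_sakura')]
```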
As shown in fig. 6, assume the target song name of the target song is "all rainy days", and a normal player, i.e., a music player that does not show the target visual animation, is in fast singing (Quick Sing) mode at this time. The music playing interface of the normal player shows part of the target lyrics, which mention a rainy day.
With continued reference to fig. 6, a download (Download) control, an add (Add to) control, a comment (Comment) control, and a share (Share) control may also be displayed, used respectively for downloading the target song, adding it to the user's favorite songs, letting the user comment on it, and letting the user share it with friends.
When the target lyrics of fig. 6 are recognized as containing the target lyric keyword "raining", a mark may be placed at that keyword. As shown in fig. 7, when the target lyric keyword "raining" is played, a corresponding cloud-and-rain visual animation appears on the screen as the target lyric visual animation, showing a cloud-and-rain animation effect.
As shown in fig. 8, the cloud-and-rain animation effect may be set to fade out after 20 seconds, returning to the original music player that does not show the target visual animation.
As shown in fig. 9, assume the target lyrics include a line mentioning cherry blossoms.
As shown in fig. 10, when the target lyrics are recognized as containing the target lyric keyword "cherry blossom", a target visual animation related to cherry blossoms appears on the screen when that keyword is played, showing a cherry blossom animation effect.
As shown in fig. 11, after the cherry blossom animation effect has been shown for 10 seconds, it gradually disappears and the display returns to the original music player without the effect.
The method provided by the embodiment of the disclosure can also be applied to other scenes.
For example, the method can be applied to the full-screen lyric page shown in fig. 12, on which the audio playing interface displays the target song name "all rainy days" together with part of its target lyrics; when a target lyric keyword such as "rainy" is played, the cloud-and-rain animation effect shown in figs. 7 and 8 above can be displayed.
Referring to FIG. 12, Cover versions (Cover versions) may also be displayed.
For another example, the method can be applied to a karaoke (K song) page as shown in fig. 13. As shown in fig. 13, a feedback (Feedback) control may be displayed for the user to submit feedback, a recording (Recording) control may be displayed for the recording function, and a singing part (Vocal) control, a restart (Restart) control, a finish (Finish) control, and a key (Key) control may also be displayed.
When the user records the target song using these controls, the target lyric keywords in the hummed target song audio can be identified during humming, so that the corresponding target lyric visual animation is displayed on the karaoke page (the audio playing interface).
Wherein the method may further comprise: obtaining a target song list subject term of the target song list; matching the target song list subject term against the song keywords, and determining the target song list keyword and its keyword tag from the song keywords; and determining, according to the mapping relation, the target song list visual animation corresponding to the keyword tag of the target song list keyword.
During the playing of the target audio information, displaying the target visual animation matched with the target visual key information on the audio playing interface may include: displaying the target song list visual animation on the audio playing interface from the start of playing a song in the target song list for the song list animation playing duration.
FIG. 14 schematically shows a flow chart of a method of visual animated display according to an embodiment of the present disclosure. In the fig. 14 embodiment, the target audio information may include a target song list, the target visual key information may include a target song list keyword, and the target visual animation may include a target song list visual animation.
As shown in fig. 14, the method provided by the embodiment of the present disclosure may include the following steps.
In step S1401, a target song list subject term of the target song list is obtained.
In the embodiment of the present disclosure, the target song list may be a collection of songs collected and categorized by the user, a collection of songs of a certain singer, or a collection of songs of a certain album. Each target song list can have a corresponding target song list subject term that expresses the common characteristics of the songs in it. The target song list subject term may be, for example, the list name set for the categorized collection of songs.
In step S1402, the target song list subject term is matched against the song keywords, and the target song list keyword and its keyword tag are determined from the song keywords.
In step S1403, the target song list visual animation corresponding to the keyword tag of the target song list keyword is determined from the visual animations according to the mapping relation.
In step S1404, the target song list visual animation is displayed on the audio playing interface from the start of playing a song in the target song list for the song list animation playing duration.
In the embodiment of the disclosure, if a target song list keyword is identified, the corresponding target song list visual animation can be displayed when any song in the target song list starts playing.
In some embodiments, the song to be played or currently playing in the target song list may further be processed in the same way as the target song described above; for example, its lyrics are identified, and if a target lyric keyword is matched in the lyrics, the corresponding target lyric visual animation can be displayed when the lyric line containing that keyword is played.
For example, as shown in fig. 15, assume the target song list subject term is "cherry season, in japan" and the songs it contains are named "fuji mountain", "bye biblions", and so on.
With continued reference to fig. 15, a search input box may also be included, in which the user may enter songs, singers, lyrics, or albums to search. A singer control, a song list control, a live broadcast control, and a listen-together control may also be included; clicking the song list control displays the corresponding target song list.
Fig. 16 schematically shows the cherry blossom animation effect displayed when a song in the target song list shown in fig. 15 is played.
According to the visual animation display method provided by the embodiment of the disclosure, a target song list subject term can carry varied textual information, and different subject terms can evoke different feelings in a user. A series of visual animations matching the mood of the song keywords is configured in advance, and the corresponding keyword tag is attached to specific words in the target song list subject term. When a song in the target song list is played, the target song list visual animation corresponding to the keyword tag can be displayed on audio playing interfaces such as the music player, the lyric page, or the music app home page, producing a visual Easter-egg effect matched to the target song list keyword and increasing the atmosphere and surprise of listening to music.
Building on the method provided by the above embodiments, the embodiment of the disclosure can further employ artificial intelligence technology to synthesize the target visual animation automatically, which can reduce the resource consumption of producing visual animations, lower memory overhead, save development cost, improve production efficiency, and make visual animation production automated and intelligent.
FIG. 17 schematically shows a flow diagram of a method of visual animated display according to an embodiment of the present disclosure. As shown in fig. 17, the method provided by the embodiment of the present disclosure may further include the following steps.
In step S1701, the target song is processed by the emotion classifier, and a target emotion classification result of the target song is determined.
In the embodiment of the disclosure, an emotion classifier can be trained in advance; it may be a binary classifier or a multi-class classifier. The target lyrics (or part of them) of a target song can be input into the trained emotion classifier, which processes them and outputs the target emotion classification result of the target song.
There may be more than one target emotion classification result, for example two, and the number may be set according to actual conditions.
In step S1702, target object feature information that triggers the audio selection instruction is obtained.
For example, if a user of the terminal device clicks the audio playing interface to select a target song, the user is a target object for triggering the audio selection instruction.
In the embodiment of the present disclosure, the target object feature information refers to feature information that reflects characteristics of the target object itself and its personal habits when listening to audio information such as songs and music. Characteristics of the target object itself may include the name, age, gender, and geographical location of the target object, and feature information of the terminal device used (for example, the type and model of a smartphone). Personalized features of the listening process may include the target object's listening habits, for example common traits of the songs it plays as extracted by big-data analysis (such as belonging to classical music or modern pop, or sharing the same singer or album), or a preference for the full-screen lyric page or the karaoke page while listening.
In step S1703, a target background image, a target material, and a target animation effect are determined according to the target emotion classification result and the target object feature information.
For example, as shown in fig. 19, the display is composed of an original image layer, an Easter egg layer, and an animation effect layer. The original image layer corresponds to the music player interface (abbreviated as the player interface in fig. 19, i.e., the audio playing interface). The Easter egg layer further comprises a background overlay layer and a material layer: the background overlay layer can be used to select a background style and upload background images, and the material layer can be used to select a material library and upload materials. The animation effect layer may offer animation effects such as drop, fly-in, explosion, fade-in, and blur.
After the target emotion classification result and the target object feature information are determined, a target background image matched with them can be selected from the uploaded background images, a target material matched with them can be selected from the uploaded materials, and a target animation effect matched with them can be selected from the provided animation effects.
In step S1704, the target visual animation is generated according to the target background map, the target material, and the target animation effect.
The target background image is placed on the background overlay layer of the Easter egg layer, the target material on the material layer of the Easter egg layer, and the target animation effect on the animation effect layer; these are combined to generate the target visual animation, which is displayed over the original image layer.
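A hedged sketch of combining the three layers described above into a target visual animation description follows. The dataclass fields, emotion key, and asset names are illustrative assumptions; a real renderer would consume a richer structure.

```python
from dataclasses import dataclass

@dataclass
class VisualAnimation:
    background: str   # background overlay layer: the selected background image
    material: str     # material layer: the selected foreground assets
    effect: str       # animation effect layer: drop, fly-in, explosion, fade-in, blur...

def compose_animation(emotion: str, backgrounds: dict, materials: dict, effects: dict) -> VisualAnimation:
    """Pick one asset per layer for the given emotion result and stack them."""
    return VisualAnimation(backgrounds[emotion], materials[emotion], effects[emotion])

anim = compose_animation(
    "joy",
    backgrounds={"joy": "bg/sunny_sky.png"},
    materials={"joy": "assets/balloons.png"},
    effects={"joy": "fly-in"},
)
print(anim)  # the result is rendered on top of the original player interface layer
```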
The emotion classifier trained in the above embodiments may employ any suitable type of machine learning model with an emotion classification function. For example, a Deep Neural Network (DNN) model may be trained to obtain the emotion classifier. In the embodiment of the present disclosure, the emotion classifier is obtained by training an RNN model combined with a CNN model, which is described below with reference to fig. 18 as an example.
FIG. 18 schematically shows a flow diagram of a visual animated display method according to an embodiment of the present disclosure. As shown in fig. 18, the method provided by the embodiment of the present disclosure may further include the following steps.
In step S1801, a training data set is obtained, where the training data set includes the sample lyrics and their emotion labels.
As shown in fig. 20, sample lyrics (all or part of a song's lyrics) may be extracted from a large corpus of lyric material and labeled with emotion tags, where different emotion tags correspond to different emotion classification results; for example, emotion tag "0" may represent sadness, "1" happiness, "2" excitement, "3" joy, "4" longing, and so on. The sample lyrics and their corresponding emotion tags are used as the training data set to train the emotion classifier.
In step S1802, the text emotion feature of the sample lyrics is extracted through a machine learning model.
In the disclosed embodiments, some data pre-processing may also be performed on the sample lyrics before they are input to the machine learning model.
For example, each line in the sample lyrics is treated as a complete sentence, with sentences separated by spaces. During the data preprocessing stage these texts are converted into tokens that the machine can recognize. First, the text data corresponding to the sample lyrics is loaded and descriptive statistics are computed, so that the number of sample lyrics for each emotion classification result is distributed as evenly as possible. Then a dictionary is constructed from the corpus: the text is segmented into words and deduplicated. After word segmentation, the words can be stemmed, and word frequency statistics are then computed.
In the embodiment of the present disclosure, TF-IDF (term frequency-inverse document frequency) may be used to weight the words in the segmented text. For example, a word that occurs only once inflates the dictionary and can add noise to text processing. Removing such words greatly reduces the dictionary size and speeds up model training on the one hand, and reduces noise on the other.
Only words that appear more than once in the corpus are retained when constructing the dictionary. Here <pad> and <unk> are two initial tokens: <pad> is used for sentence padding, and <unk> replaces words that do not appear in the corpus. Finally, a dictionary containing a number of words is obtained.
With the dictionary, a word-to-token mapping table and a token-to-word mapping table are constructed. Based on these, the original text of the sample lyrics can be converted into machine-recognizable codes. In addition, sentence lengths need to be standardized so that all sentences have the same length. For example, if statistics show that the average sentence length in the corpus is 20 words, 20 can be set as the standard length: sentences with more than 20 words are truncated, and sentences with fewer than 20 words are padded with <pad>.
A function is constructed that receives a complete string-type sentence and converts it into tokens according to the mapping table. In this function, the <unk> and <pad> codes are obtained first for later use; the sentence is then mapped word by word, substituting the <unk> token for unseen words; and finally the sentence length is standardized.
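A minimal sketch of this conversion function is shown below: it maps a raw sentence to a fixed-length list of token ids using a word-to-token table, with <unk> for unseen words and <pad> for length standardization. The toy vocabulary is an assumption for illustration.

```python
# Toy word-to-token mapping; a real dictionary is built from the corpus.
word_to_token = {"<pad>": 0, "<unk>": 1, "rain": 2, "falls": 3, "tonight": 4}

def sentence_to_tokens(sentence: str, mapping: dict[str, int], max_len: int = 20) -> list[int]:
    """Convert a sentence to token ids, truncating or padding to max_len."""
    unk = mapping["<unk>"]
    pad = mapping["<pad>"]
    tokens = [mapping.get(word, unk) for word in sentence.split()]
    tokens = tokens[:max_len]                  # truncate sentences that are too long
    tokens += [pad] * (max_len - len(tokens))  # pad sentences that are too short
    return tokens

print(sentence_to_tokens("rain falls on me tonight", word_to_token, max_len=8))
# [2, 3, 1, 1, 4, 0, 0, 0]  ("on" and "me" map to <unk>)
```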
Then, the sample lyric texts of the various emotion classification results are converted respectively to obtain the word embeddings (word vectors) of the corresponding words, which serve as the extracted text emotion features.
In the embodiment of the present disclosure, word2vec (a family of related models used to generate word vectors) may be used as the machine learning model to map each word in the sample lyrics to a vector; the vectors correspond to a hidden layer of the neural network.
The present disclosure is not limited to using word2vec models to generate word vectors; for example, pretrained 300-dimensional word vectors from the GloVe model may also be used. After loading the word vectors, it may be found that most words in the corpus-built dictionary have pretrained vectors while some do not; for words without pretrained vectors, random values can be substituted directly.
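A hedged sketch of building the embedding matrix: copy a pretrained vector (e.g., 300-dimensional GloVe) when the dictionary word has one, otherwise fall back to random initialization. Here `pretrained` is assumed to be a word-to-NumPy-vector lookup loaded elsewhere.

```python
import numpy as np

def build_embedding_matrix(vocab: dict[str, int], pretrained: dict, dim: int = 300) -> np.ndarray:
    """Return a (vocab_size, dim) matrix: pretrained vectors where available, random otherwise."""
    rng = np.random.default_rng(0)
    matrix = rng.normal(scale=0.1, size=(len(vocab), dim)).astype("float32")
    for word, idx in vocab.items():
        vec = pretrained.get(word)
        if vec is not None:
            matrix[idx] = vec  # keep the pretrained vector when one exists
    return matrix
```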
In step S1803, the text emotion feature of the sample lyric is processed by using a recurrent neural network model, so as to obtain a global text feature of the text emotion feature.
In the embodiment of the disclosure, the text emotion features output by the machine learning model may be input into an RNN (Recurrent Neural Network) model for further processing to extract global text features. The extracted global text features take the context of the sample lyrics into account, so the emotion contained in the sample lyrics can be determined more accurately.
In the embodiment of the present disclosure, an LSTM (Long Short-Term Memory) model, a kind of RNN, may be used to extract the global text features. Thanks to its gate mechanism, an LSTM can learn and capture long-range dependencies in a sequence: the gates let the model learn which information to retain and which to forget. For example, when the LSTM model sees "good" in the sample lyrics, it still remembers that the negative word "not" appeared before it; the relationship between "like" and "not" can be learned similarly. That is, LSTM is better at capturing long-sequence relationships and is therefore well suited to long-sequence NLP (Natural Language Processing) problems.
As described above, the word embeddings of the sentences in the sample lyrics are obtained and fed into the LSTM for training (for example, a single-layer LSTM with 512 hidden units), the last hidden state of the LSTM is extracted, and a fully connected layer is added to obtain the final output. With word embeddings, the LSTM does not need to sum the word vectors; it learns from the word vectors directly, which avoids the information loss caused by aggregation operations such as summing or averaging.
However, the present disclosure does not limit the RNN model to LSTM; for example, a GRU (Gated Recurrent Unit) or a bidirectional LSTM model may also be used.
In step S1804, the global text features are processed by a Text Convolutional Neural Network (TextCNN) model to extract local text features of the text emotion features, and a predicted emotion classification result of the sample lyrics is obtained from the local text features.
Unlike the LSTM model, which captures global text features over long sequences, the TextCNN model may further be incorporated in the disclosed embodiments to capture local text features. The TextCNN model structure consists of an embedding layer, a convolutional layer, a max-pooling layer, and a fully connected layer in sequence. The global text features are input into the embedding layer, and the convolution operation then slides over them to capture local relations between words. The convolution yields the convolutional layer outputs as a number of column vectors; a max-pooling operation extracts the most important information from each column vector, i.e., the locally important information, for example capturing a local negation such as "dislike", which helps classify the emotion of the sample lyrics correctly. Finally, a fully connected layer produces the output.
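The following is a hedged PyTorch sketch of this pipeline: word embeddings feed a single-layer LSTM (512 hidden units) whose per-step outputs serve as global text features, a convolution-plus-max-pooling stage then extracts local features, and a fully connected layer classifies. Layer sizes, the kernel width, and the number of emotion classes are assumptions, not the disclosure's fixed design.

```python
import torch
import torch.nn as nn

class EmotionClassifier(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 300,
                 hidden: int = 512, channels: int = 128, classes: int = 5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)   # global features
        self.conv = nn.Conv1d(hidden, channels, kernel_size=3, padding=1)  # local features
        self.fc = nn.Linear(channels, classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        x = self.embed(tokens)                                   # (batch, seq, embed_dim)
        global_feats, _ = self.lstm(x)                           # (batch, seq, hidden)
        local = self.conv(global_feats.transpose(1, 2)).relu()   # (batch, channels, seq)
        pooled = local.max(dim=2).values   # max-pooling keeps the strongest local cue
        return self.fc(pooled)             # (batch, classes) emotion logits

logits = EmotionClassifier(vocab_size=10000)(torch.zeros(4, 20, dtype=torch.long))
print(logits.shape)  # torch.Size([4, 5])
```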
The visual animation display method provided by the embodiment of the disclosure trains the emotion classifier by combining the machine learning model, the recurrent neural network model, and the text convolutional neural network model: the machine learning model produces the word vectors, a recurrent neural network such as an LSTM then captures the global text features, and the text convolutional neural network captures the local text features. This can speed up model training and improve the classification accuracy of the trained emotion classifier. The more accurate the emotion classification, the more accurately the emotion hidden in the target lyrics can be recognized, and the better the finally generated or matched target visual animation fits the mood of the currently playing target song, so the target visual animation matching the corresponding emotion can be displayed more accurately.
In step S1805, the recurrent neural network model and the text convolutional neural network model are trained according to the predicted emotion classification results and their emotion labels, to obtain the emotion classifier.
In the embodiment of the disclosure, a loss function can be computed from the predicted emotion classification results and their corresponding emotion labels; when the loss function converges or a preset number of iterations is reached, training of the RNN model and the TextCNN model can be stopped to obtain the final emotion classifier.
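A minimal training-loop sketch follows: cross-entropy against the emotion labels, stopping after a fixed iteration budget (a convergence check on the loss could be used instead). It assumes the EmotionClassifier class from the sketch above; the toy dataset and hyperparameters are placeholders.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

tokens = torch.randint(1, 10000, (64, 20))   # toy token ids standing in for sample lyrics
labels = torch.randint(0, 5, (64,))          # toy emotion labels 0..4
loader = DataLoader(TensorDataset(tokens, labels), batch_size=16, shuffle=True)

model = EmotionClassifier(vocab_size=10000)  # defined in the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = torch.nn.CrossEntropyLoss()      # loss over predicted results vs. emotion labels

for epoch in range(3):                       # preset iteration budget
    for batch_tokens, batch_labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(batch_tokens), batch_labels)
        loss.backward()
        optimizer.step()
```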
With reference to fig. 20, after the emotion classifier is obtained by the method of the fig. 18 embodiment, the target lyrics of the target song may be input into the machine learning model to obtain their text emotion features, which are then input into the trained emotion classifier; the emotion classifier performs emotion classification on the target lyrics to obtain the target emotion classification result of the target song.
For example, if the emotion tag corresponding to the target emotion classification result is A, a target visual animation with Easter egg effect A can be generated by combination from the Easter egg template library (including the uploaded background images, uploaded materials, selectable animation effects, and the like) according to the approach of fig. 19; if the emotion tag is B, a target visual animation with Easter egg effect B can be generated by combination from the library; if the emotion tag is C, a target visual animation with Easter egg effect C can be generated, and so on.
The method provided by the above embodiment of the present disclosure can be implemented by using a computer vision technology, a speech technology, a natural language processing technology and a machine learning technology in an artificial intelligence technology.
Artificial Intelligence (AI) is a theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, giving machines the capabilities of perception, reasoning, and decision-making.
Artificial intelligence is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technology. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, and machine learning/deep learning.
Computer Vision (CV) is the science of how to make machines "see": using cameras and computers instead of human eyes to identify, track, and measure targets, and performing further image processing so that the processed image is better suited for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies theories and techniques for building artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technology generally includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, and also includes common biometric technologies such as face recognition and fingerprint recognition.
The key technologies of Speech Technology are Automatic Speech Recognition (ASR), speech synthesis (Text To Speech, TTS), and voiceprint recognition. Enabling computers to listen, see, speak, and feel is the development direction of future human-computer interaction, and speech is expected to become one of the most promising interaction modes.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods for effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics; research in this field involves natural language, the language people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, and knowledge graphs.
Machine Learning (ML) is a multidisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and more. It studies how computers can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures to keep improving performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
FIG. 21 schematically illustrates a block diagram of a visual animated display device according to an embodiment of the present disclosure. As shown in fig. 21, the visual animation display device 2100 provided by the embodiment of the present disclosure may include a target audio information determination unit 2110, a target visual key information determination unit 2120, a target visual animation acquisition unit 2130, and a target visual animation display unit 2140.
In the embodiment of the present disclosure, the target audio information determining unit 2110 may be configured to determine, in response to an audio selection instruction for the audio playing interface, target audio information to be currently played. The target visual key information determining unit 2120 may be configured to determine target visual key information that matches the target audio information, wherein the target visual key information includes a keyword that is present in the target audio information and that is suitable for being visually presented. The target visual animation obtaining unit 2130 may be configured to obtain a target visual animation that matches the target visual key information, where the target visual animation is configured to visually present a keyword in the target visual key information. The target visual animation display unit 2140 may be configured to display, in the audio playing interface, a target visual animation that matches the target visual key information during playing of the target audio information.
On the one hand, the visual animation display device provided by the embodiment of the disclosure can, in response to an audio selection instruction on the audio playing interface, determine the target audio information to be played. Since the target audio information is associated with target visual key information, the target visual animation matched with that information can be displayed on the audio playing interface while the target audio information plays. By producing a visual Easter-egg effect for the user, playability and interest are improved, interactivity while listening to audio resources is enhanced, the atmosphere of listening is strengthened, visual freshness is added, and the user's aesthetic fatigue is reduced. On the other hand, as the target audio information plays, the target visual animation can appear at varying times, surprising the user and arousing curiosity, which encourages more use and avoids waste of resources.
In an exemplary embodiment, the visual animated display device 2100 may further include: the song keyword acquisition unit can be used for acquiring song keywords in the sample song; a keyword tag obtaining unit, configured to obtain a keyword tag corresponding to the song keyword; the visual animation acquisition unit can be used for acquiring designed visual animation; and the mapping relation obtaining unit can be used for obtaining the mapping relation between the keyword tag and the corresponding visual animation.
In an exemplary embodiment, the target audio information may include a target song, the target visual key information may include a target song name keyword, and the target visual animation may include a target song name visual animation. Among them, the visual animation display device 2100 may further include: a target song name obtaining unit operable to obtain a target song name of the target song; the target song name keyword determining unit can be used for matching the target song name with the song keywords and determining the target song name keywords and keyword labels thereof from the song keywords; and the target song name visual animation determining unit can be used for determining the target song name visual animation corresponding to the keyword label of the target song name keyword from the visual animation according to the mapping relation.
Among them, the target visual animation display unit 2140 may include: and the target song name visual animation display unit can be used for displaying the target song name visual animation on the audio playing interface from the beginning of playing the target song to the playing time of the song name animation.
In an exemplary embodiment, the target audio information may include a target song, the target visual key information may include a target song cover keyword, and the target visual animation may include a target song cover visual animation. Among them, the visual animation display device 2100 may further include: the target song cover display unit can be used for displaying a target song cover of the target song on the audio playing interface; the optical character recognition unit can be used for carrying out optical character recognition on the cover of the target song to obtain a character recognition result of the cover of the target song; a target song cover keyword determining unit, configured to match a text recognition result of the target song cover with the song keyword, and determine the target song cover keyword and a keyword tag thereof from the song keyword; and the target song cover visual animation determining unit can be used for determining the target song cover visual animation corresponding to the keyword label of the target song cover keyword from the visual animation according to the mapping relation. Among them, the target visual animation display unit 2140 may include: and the target song cover visual animation display unit can be used for displaying the target song cover visual animation on the audio playing interface from the beginning of playing the target song to the time length of playing the cover animation.
In an exemplary embodiment, the target audio information may include a target song list, the target visual key information may include a target song list keyword, and the target visual animation may include a target song list visual animation. The visual animation display device 2100 may further include: a target song list subject term obtaining unit, which may be used to obtain the target song list subject term of the target song list; a target song list keyword determining unit, which may be used to match the target song list subject term against the song keywords and determine the target song list keyword and its keyword tag from them; and a target song list visual animation determining unit, which may be used to determine, according to the mapping relation, the target song list visual animation corresponding to the keyword tag of the target song list keyword. The target visual animation display unit 2140 may include: a target song list visual animation display unit, which may be used to display the target song list visual animation on the audio playing interface from the start of playing a song in the target song list for the song list animation playing duration.
In an exemplary embodiment, the target audio information may include a target song, the target visual key information may include a target lyric keyword, and the target visual animation may include a target lyric visual animation. Among them, the visual animation display device 2100 may further include: a target lyric obtaining unit, which may be used to obtain target lyrics of the target song; the target lyric keyword determining unit can be used for matching the target lyrics with the song keywords and determining the target lyric keywords and keyword labels thereof from the song keywords; and the target lyric visual animation display unit can be used for determining the target lyric visual animation corresponding to the keyword label of the target lyric keyword from the visual animation according to the mapping relation. Among them, the target visual animation display unit 2140 may include: and the target lyric visual animation display unit can be used for displaying the target lyric visual animation on the audio playing interface from the playing time of the target lyric line where the target lyric keyword is located to the playing time of the lyric animation.
In an exemplary embodiment, the target lyric visual animation display unit may include: the first-time display unit of the target lyric visual animation can be used for displaying the target lyric visual animation corresponding to the target lyric keyword matched for the first time in the target lyric line on the audio playing interface in the process of playing the target lyric line if at least two different target lyric keywords exist in the target lyric line.
In an exemplary embodiment, the target lyric visual animation display unit may include: a preceding-words recognition unit, operable to identify a predetermined number of words in the target lyric line preceding the target lyric keyword; a matched negative word obtaining unit, configured to match the predetermined number of words against a negative word list to obtain the matched negative words; and a target lyric visual animation display subunit, operable to display the target lyric visual animation on the audio playing interface, from the playing of the target lyric line for the lyric animation playing duration, if the number of matched negative words is even.
In an exemplary embodiment, the target lyric obtaining unit may be used to: obtain the target lyrics of the target song before the target song starts playing; or display and obtain the target lyrics on the audio playing interface while the target song is playing; or, while the target song is playing, obtain the target song audio of the target song and perform speech recognition on it to obtain the target lyrics.
In an exemplary embodiment, the target visual key information may include a target song keyword. Among them, the target visual animation display unit 2140 may include: the target song keyword determining unit can be used for matching the target song with the song keywords and determining the target song keywords from the song keywords; the same target visual animation single-time display unit may be configured to, if the same target song keyword is repeatedly identified from the target song, display, in the audio playing interface, a target visual animation that matches the same target song keyword when the same target song keyword is identified for the first time in the process of playing the target song.
In an exemplary embodiment, the target audio information may include a target song. Among them, the visual animation display device 2100 may further include: the target emotion classification result determining unit can be used for processing the target song through an emotion classifier and determining a target emotion classification result of the target song; a target object characteristic information obtaining unit, configured to obtain target object characteristic information that triggers the audio selection instruction; the target material determining unit can be used for determining a target background image, a target material and a target animation effect according to the target emotion classification result and the target object feature information; and the target visual animation generating unit can be used for generating the target visual animation according to the target background picture, the target material and the target animation effect.
In an exemplary embodiment, the visual animated display device 2100 may further include: the training data set acquisition unit can be used for acquiring a training data set, wherein the training data set comprises sample lyrics and emotion labels thereof; the text emotional feature extraction unit can be used for extracting the text emotional features of the sample lyrics through a machine learning model; the global text feature obtaining unit can be used for processing the text emotional features of the sample lyrics by utilizing a recurrent neural network model to obtain the global text features of the text emotional features; the emotion classification result prediction unit is used for processing the global text features through a text convolution neural network model so as to extract local text features of the text emotion features and obtain a predicted emotion classification result of the sample lyrics according to the local text features; and the emotion classifier training unit can be used for training the cyclic neural network model and the text convolutional neural network model according to the predicted emotion classification result and the emotion label thereof so as to obtain the emotion classifier.
Other contents of the visual animated display device of the embodiment of the present disclosure may refer to the above-described embodiments.
It should be noted that although several units of the device for action execution are mentioned in the detailed description above, this division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more units described above may be embodied in one unit; conversely, the features and functions of one unit described above may be further divided among a plurality of units.
Reference is now made to fig. 22, which is a block diagram illustrating an electronic device suitable for use in implementing embodiments of the present application. The electronic device shown in fig. 22 is merely an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
Referring to fig. 22, an electronic device provided by an embodiment of the present disclosure may include: a processor 2201, a communication interface 2202, a memory 2203, and a communication bus 2204.
Wherein the processor 2201, the communication interface 2202, and the memory 2203 communicate with each other via a communication bus 2204.
Alternatively, the communication interface 2202 may be an interface of a communication module, such as an interface of a GSM (Global System for Mobile communications) module. The processor 2201 is used to execute programs, and the memory 2203 is used to store programs. A program may comprise a computer program including computer operation instructions; for example, the program may include a client program.
The processor 2201 may be a central processing unit (CPU), an Application-Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present disclosure.
The memory 2203 may include Random Access Memory (RAM) and may further include non-volatile memory, such as at least one disk memory.
The program may specifically be used to: respond to an audio selection instruction of the audio playing interface and determine the target audio information to be played currently; determine target visual key information that matches the target audio information, wherein the target visual key information comprises keywords that are present in the target audio information and that are suitable for visual presentation; acquire a target visual animation matched with the target visual key information, wherein the target visual animation is used for presenting keywords in the target visual key information in a visual mode; and, in the process of playing the target audio information, display the target visual animation matched with the target visual key information on the audio playing interface.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternative implementations of the embodiments described above.
It is to be understood that any number of elements in the drawings of the present disclosure are by way of example and not by way of limitation, and any nomenclature is used for differentiation only and not by way of limitation.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.