Online learning recommendation method and device based on big data and computer equipment
1. An online learning recommendation method based on big data is characterized by comprising the following steps:
acquiring a user learning behavior log and unstructured industry data;
processing the user learning behavior log through an online recommendation component to obtain a first recommendation result; processing the unstructured industry data through an offline recommendation component to obtain a second recommendation result;
inputting the first recommendation result and the second recommendation result into a cache;
and outputting the first recommendation result and the second recommendation result from the cache.
2. The big data-based online learning recommendation method according to claim 1, wherein the processing the user learning behavior log by an online recommendation component to obtain a first recommendation result comprises:
acquiring the user learning behavior log through a Flume component, storing the user learning behavior log in a Kafka intermediate cache component, and processing the user learning behavior log through a Spark Streaming component to obtain the first recommendation result.
3. The big data-based online learning recommendation method according to claim 1, wherein the processing the unstructured industry data through an offline recommendation component to obtain a second recommendation result comprises:
data mining is carried out on the unstructured industry data through a Spark SQL component;
and processing the unstructured industry data through a Spark MLlib component to obtain the second recommendation result.
4. The big data-based online learning recommendation method according to claim 1, wherein the processing of the unstructured industry data through a Spark MLlib component to obtain a second recommendation result comprises:
and performing feature extraction, conversion and selection on the unstructured industry data based on a User-based algorithm, an Item-based algorithm and a Model-based algorithm to obtain a second recommendation result.
5. The big-data-based online learning recommendation method according to claim 1, wherein the user learning behavior log comprises: login information, course information, examination information and courseware learning information; the unstructured industry data comprises department information, post information, job level information, educational background information, hire date information and gender information.
6. An online learning recommendation device based on big data is characterized by comprising:
the data acquisition module is used for acquiring a user learning behavior log and unstructured industry data;
the recommendation result acquisition module is used for processing the user learning behavior log through an online recommendation component to obtain a first recommendation result; processing the unstructured industry data through an offline recommendation component to obtain a second recommendation result;
the input module is used for inputting the first recommendation result and the second recommendation result into the cache;
and the output module is used for outputting the first recommendation result and the second recommendation result from the cache.
7. The big-data-based online learning recommendation device according to claim 6, wherein the recommendation result obtaining module comprises:
the first recommendation result acquisition submodule is used for acquiring the user learning behavior log through a Flume component, storing the user learning behavior log in a Kafka intermediate cache component, and processing the user learning behavior log through a Spark Streaming component to obtain the first recommendation result.
8. The big-data-based online learning recommendation device according to claim 6, wherein the recommendation result obtaining module comprises:
the data mining submodule is used for carrying out data mining on the unstructured industry data through a Spark SQL component;
and the second recommendation result acquisition submodule is used for processing the unstructured industry data through a Spark MLlib component to obtain the second recommendation result.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program implements the steps of the big-data based online learning recommendation method according to any one of claims 1 to 5.
10. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when being executed by a processor, implements the steps of the big data based online learning recommendation method according to any one of claims 1 to 5.
Background
With the growth of online training requirements, personalized and precise training is becoming increasingly important: targeted training content needs to be pushed to learners according to user profiles (for example, the different posts, levels and departments of employees within an enterprise) and learning behavior habits;
current recommendation methods are either based on simple rules, forming recommendation results along dimensions such as highest rated, most viewed and most learners, or based on a capability model, in which a capability model is planned for every post and level across the whole company; most online learning and training platforms on the market use simple rule-based recommendation rather than truly intelligent recommendation, and these approaches have the following defects:
first, it is difficult to formulate rules that suit every individual, so the requirements of users cannot be met; the rules also need to be updated continuously, which is costly and inefficient;
second, as a company develops, the capability model needs to be updated regularly, so the investment cost for the enterprise is high and the visible return is low.
Disclosure of Invention
In view of the above problems, embodiments of the present invention provide a big-data-based online learning recommendation method, a big-data-based online learning recommendation apparatus, a computer device, and a storage medium that overcome or at least partially solve the above problems.
In order to solve the above problems, an embodiment of the present invention discloses an online learning recommendation method based on big data, including:
acquiring a user learning behavior log and unstructured industry data;
processing the user learning behavior log through an online recommendation component to obtain a first recommendation result; processing the unstructured industry data through an offline recommendation component to obtain a second recommendation result;
inputting the first recommendation result and the second recommendation result into a cache;
and outputting the first recommendation result and the second recommendation result from the cache.
Preferably, the processing, by the online recommendation component, the user learning behavior log to obtain a first recommendation result includes:
acquiring the user learning behavior log through a Flume component, storing the user learning behavior log in a Kafka intermediate cache component, and processing the user learning behavior log through a Spark Streaming component to obtain the first recommendation result.
Preferably, the processing the unstructured industry data by the offline recommendation component to obtain a second recommendation result includes:
data mining is carried out on the unstructured industry data through a Spark SQL component;
and processing the unstructured industry data through a Spark MLlib component to obtain the second recommendation result.
Preferably, the processing of the unstructured industry data through the Spark MLlib component to obtain the second recommendation result includes:
and performing feature extraction, conversion and selection on the unstructured industry data based on a User-based algorithm, an Item-based algorithm and a Model-based algorithm to obtain a second recommendation result.
Preferably, the user learning behavior log comprises: login information, course information, examination information and courseware learning information; the unstructured industry data comprises department information, post information, job level information, educational background information, hire date information and gender information.
The embodiment of the invention also discloses an online learning recommendation device based on big data, which comprises:
the data acquisition module is used for acquiring a user learning behavior log and unstructured industry data;
the recommendation result acquisition module is used for processing the user learning behavior log through an online recommendation component to obtain a first recommendation result; processing the unstructured industry data through an offline recommendation component to obtain a second recommendation result;
the input module is used for inputting the first recommendation result and the second recommendation result into the cache;
and the output module is used for outputting the first recommendation result and the second recommendation result from the cache.
Preferably, the recommendation result obtaining module includes:
the first recommendation result acquisition submodule is used for acquiring the user learning behavior log through a Flume component, storing the user learning behavior log in a Kafka intermediate cache component, and processing the user learning behavior log through a Spark Streaming component to obtain the first recommendation result.
Preferably, the recommendation result obtaining module includes:
the data mining submodule is used for carrying out data mining on the unstructured industry data through a Spark SQL component;
and the second recommendation result acquisition submodule is used for processing the unstructured industry data through a Spark MLlib component to obtain the second recommendation result.
The embodiment of the invention also discloses computer equipment which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps of the online learning recommendation method based on the big data when executing the computer program.
The embodiment of the invention also discloses a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the steps of the online learning recommendation method based on big data are realized.
The embodiment of the invention has the following advantages:
in the embodiment of the invention, the online learning recommendation method based on big data comprises the following steps: acquiring a user learning behavior log and unstructured industry data; processing the user learning behavior log through an online recommendation component to obtain a first recommendation result; processing the unstructured industry data through an offline recommendation component to obtain a second recommendation result; inputting the first recommendation result and the second recommendation result into a cache; and outputting the first recommendation result and the second recommendation result from the cache. Online big-data recommendation calculation is performed on the massive logs to produce recommended content based on the learning behavior of the user, and offline big-data recommendation calculation is performed on the structured offline data to produce recommended content based on the user portrait and the learning behavior.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating steps of an embodiment of a big data-based online learning recommendation method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first recommendation result obtaining step according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a second recommendation result obtaining step according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating the sub-step of processing the unstructured industry data through the Spark MLlib component when obtaining the second recommendation result according to an embodiment of the present invention;
FIG. 5 is a block diagram of an embodiment of an online learning recommendation apparatus based on big data according to an embodiment of the present invention;
FIG. 6 is an internal block diagram of a computer device of an embodiment.
Detailed Description
In order to make the technical problems, technical solutions and advantageous effects solved by the embodiments of the present invention more clearly apparent, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a big data-based online learning recommendation method according to an embodiment of the present invention is shown, which may specifically include the following steps:
Step 101, acquiring a user learning behavior log and unstructured industry data;
The recommendation method of the embodiment of the present invention may be applied to a learning platform, and the learning platform may run on a variety of terminals, such as a personal computer (PC), a smartphone, a tablet computer, and other terminals on which an application program can be installed, for example a smart watch.
The terminal may first obtain a user learning behavior log and unstructured industry data, where the user learning behavior log may include login information, course information, examination information, courseware learning information and the like; other user learning behaviors may of course also be included, and the embodiment of the invention does not limit this.
The unstructured industry data, for its part, may include department information, post information, job level information, educational background information, hire date information, gender information and the like, and may also include other unstructured industry data such as nationality information and political affiliation information, which the embodiment of the invention does not limit.
Step 102, processing the user learning behavior log through an online recommendation component to obtain a first recommendation result; processing the unstructured industry data through an offline recommendation component to obtain a second recommendation result;
Specifically, in the embodiment of the invention, the user learning behavior log can be processed through the online recommendation component to obtain the first recommendation result, and the unstructured industry data can be processed through the offline recommendation component to obtain the second recommendation result.
The online recommendation component mainly comprises a Spark Streaming component and the like; the Spark Streaming component performs word segmentation, statistics or other processing on the user learning behavior log to obtain the first recommendation result.
In addition, the offline recommendation component may include a Spark SQL component, a Spark MLlib component, and the like; the second recommendation result is obtained by data mining through the Spark SQL component followed by data feature extraction, conversion, and selection through the Spark MLlib component.
Step 103, inputting the first recommendation result and the second recommendation result into a cache;
Furthermore, the first recommendation result and the second recommendation result can be input into a Redis cache, from which downstream systems read the recommendation results, thereby improving reading efficiency.
Step 104, outputting the first recommendation result and the second recommendation result from the cache.
In practical application of the embodiment of the present invention, the first recommendation result and the second recommendation result may be output from the cache and displayed on a display device of the user.
In the embodiment of the invention, the online learning recommendation method based on big data comprises the following steps: acquiring a user learning behavior log and unstructured industry data; processing the user learning behavior log through an online recommendation component to obtain a first recommendation result; processing the unstructured industry data through an offline recommendation component to obtain a second recommendation result; inputting the first recommendation result and the second recommendation result into a cache; and outputting the first recommendation result and the second recommendation result from the cache. Online big-data recommendation calculation is performed on the massive logs to produce recommended content based on the learning behavior of the user, and offline big-data recommendation calculation is performed on the structured offline data to produce recommended content based on the user portrait and the learning behavior.
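As a purely illustrative aid, the flow of steps 101 to 104 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: a plain dictionary stands in for the cache, the two recommender functions are toy placeholders for the online and offline recommendation components, and all function names, key formats and data fields are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: a plain dict stands in for the cache (the example later in the text uses Redis).
Cache = Dict[str, List[str]]


def recommend_online(behavior_log: List[dict]) -> List[str]:
    """Toy online recommender: courses the user touched, most recent first, no duplicates."""
    seen: List[str] = []
    for event in reversed(behavior_log):
        course = event.get("course")
        if course and course not in seen:
            seen.append(course)
    return seen


def recommend_offline(profile: dict, catalog: Dict[str, List[str]]) -> List[str]:
    """Toy offline recommender: courses tagged for the user's post (hypothetical catalog)."""
    return list(catalog.get(profile.get("post", ""), []))


def run_pipeline(user_id: str, behavior_log: List[dict], profile: dict,
                 catalog: Dict[str, List[str]], cache: Cache) -> Tuple[List[str], List[str]]:
    # Step 102: compute the first (online) and second (offline) recommendation results.
    first = recommend_online(behavior_log)
    second = recommend_offline(profile, catalog)
    # Step 103: input both results into the cache under hypothetical keys.
    cache[f"rec:online:{user_id}"] = first
    cache[f"rec:offline:{user_id}"] = second
    # Step 104: output both results from the cache.
    return cache[f"rec:online:{user_id}"], cache[f"rec:offline:{user_id}"]
```

Calling run_pipeline with a behavior log, a profile and a course catalog returns the two results after they have passed through the cache, which mirrors the order of steps 102 to 104.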
In the embodiment of the present invention, referring to fig. 2, a flowchart of a first recommendation result obtaining step in the embodiment of the present invention is shown, where the processing on the user learning behavior log by the online recommendation component to obtain the first recommendation result includes the following sub-steps:
Substep 11, acquiring the user learning behavior log through a Flume component, storing the user learning behavior log in a Kafka intermediate cache component, and processing the user learning behavior log through a Spark Streaming component to obtain the first recommendation result.
In practical application of the embodiment of the present invention, the online recommendation component may include a Flume component, a Kafka intermediate cache component, and a Spark Streaming component; the first recommendation result is obtained by collecting the user learning behavior log through the Flume component, writing the user learning behavior log into the Kafka intermediate cache component, and performing related data operations such as word segmentation or statistics through the Spark Streaming component.
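The collect, buffer and process pattern described here can be illustrated with a small standalone Python sketch. It does not use Flume, Kafka or Spark Streaming: a deque stands in for the intermediate cache, a loop over small batches stands in for stream processing, and the log line format "user_id,action,course_id" is an assumption made only for this example.

```python
from collections import Counter, deque
from typing import Deque, Dict, List


def parse_log_line(line: str) -> Dict[str, str]:
    """Parse a hypothetical 'user_id,action,course_id' log line."""
    user_id, action, course_id = line.strip().split(",")
    return {"user": user_id, "action": action, "course": course_id}


def drain_batch(buffer: Deque[str], max_records: int) -> List[str]:
    """Take up to max_records lines from the buffer (stand-in for reading one micro-batch)."""
    batch: List[str] = []
    while buffer and len(batch) < max_records:
        batch.append(buffer.popleft())
    return batch


def top_courses(counts: Counter, n: int) -> List[str]:
    """Most frequently opened courses; ties are broken by course id for determinism."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [course for course, _ in ranked[:n]]


def process_stream(buffer: Deque[str], batch_size: int, n: int) -> List[str]:
    """Consume the buffer batch by batch and keep a running count of course openings."""
    counts: Counter = Counter()
    while buffer:
        for line in drain_batch(buffer, batch_size):
            event = parse_log_line(line)
            if event["action"] == "open_course":
                counts[event["course"]] += 1
    return top_courses(counts, n)
```

Because the producer only appends to the buffer and the consumer only removes from it, collection and processing can proceed at different speeds, which is the role the text assigns to the intermediate cache.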
In the embodiment of the present invention, referring to fig. 3, a flowchart of a second recommendation result obtaining step in the embodiment of the present invention is shown, where the processing on the unstructured industry data by the offline recommendation component to obtain the second recommendation result includes the following sub-steps:
Substep 21, carrying out data mining on the unstructured industry data through a Spark SQL component;
Substep 22, processing the unstructured industry data through a Spark MLlib component to obtain the second recommendation result.
In the embodiment of the invention, data mining can first be performed on the unstructured industry data through the Spark SQL component; data feature extraction, conversion and selection are then carried out using the Spark MLlib component to obtain the second recommendation result.
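To make the data mining step more concrete, the following standalone Python sketch shows the kind of join-and-aggregate query that a SQL engine could run over user profiles and learning records. It does not call Spark SQL; the table layout (a profile keyed by user with a "post" field, and records of user, course and hours) is an assumption made for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def hours_by_post_and_course(
    profiles: Dict[str, dict],
    records: List[Tuple[str, str, float]],
) -> Dict[str, Dict[str, float]]:
    """Roughly equivalent to:
    SELECT post, course, SUM(hours)
    FROM records JOIN profiles USING (user)
    GROUP BY post, course
    """
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for user, course, hours in records:
        profile = profiles.get(user)
        if profile is None:  # inner join: records without a profile are skipped
            continue
        totals[profile["post"]][course] += hours
    # Convert nested defaultdicts to plain dicts for a predictable return type.
    return {post: dict(courses) for post, courses in totals.items()}
```

An aggregate of this shape (total learning hours per post and course) is one example of a feature that could then be passed on to the feature extraction, conversion and selection stage.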
In the embodiment of the present invention, referring to fig. 4, a flowchart illustrating a second recommendation result obtaining step in the embodiment of the present invention is shown, where the processing of the unstructured industry data by using a Spark MLlib component to obtain the second recommendation result includes the following sub-steps:
and a substep 31, performing feature extraction, conversion and selection on the unstructured industry data based on a User-based algorithm, an Item-based algorithm and a Model-based algorithm to obtain a second recommendation result.
In a preferred embodiment, the Spark MLlib component provides distributed implementations of common machine learning algorithms, and can perform feature extraction, conversion and selection on the unstructured industry data based on a User-based algorithm, an Item-based algorithm and a Model-based algorithm to obtain the second recommendation result.
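For readers unfamiliar with the User-based approach, the following is a minimal single-machine sketch of user-based collaborative filtering with cosine similarity. It is an illustration only: it does not use Spark MLlib, it covers only the User-based variant, and the rating dictionaries, the neighbor count k and the function names are assumptions introduced for this example.

```python
import math
from typing import Dict, List

Ratings = Dict[str, Dict[str, float]]  # user -> {course -> score}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[c] * b[c] for c in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_user_based(ratings: Ratings, target: str, k: int = 2, n: int = 3) -> List[str]:
    """Score courses the target has not taken by the similarity-weighted ratings of the k nearest users."""
    neighbors = sorted(
        ((cosine(ratings[target], other), user)
         for user, other in ratings.items() if user != target),
        reverse=True,
    )[:k]
    scores: Dict[str, float] = {}
    for sim, user in neighbors:
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for course, score in ratings[user].items():
            if course not in ratings[target]:
                scores[course] = scores.get(course, 0.0) + sim * score
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [course for course, _ in ranked[:n]]
```

An Item-based variant would compute the same similarity between course vectors instead of user vectors; a Model-based variant would instead fit a model (for example a matrix factorization) to the ratings.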
In order that those skilled in the art will better understand the embodiments of the present invention, the following description is given by way of a specific example:
The example mainly includes the following modules:
Online learning system log rule definition: defining the rules for the user learning logs generated by the online learning platform, so that logs are generated for as many of the user's learning behaviors as possible, that is, all learning-related actions such as logging in, opening a course, taking an examination and studying courseware;
Online recommendation calculation: Flume, Kafka and Spark Streaming are integrated to perform big-data computation over the large volume of logs and obtain the required push data. Flume is a highly available, highly reliable, distributed system for collecting, aggregating and transmitting massive user behavior logs, and it writes the collected data into Kafka. Because data collection and data processing are not necessarily synchronous, Kafka is added as an intermediate cache so that data is not lost when Spark Streaming fails;
The data in Kafka is processed as a real-time stream by Spark Streaming. The first step is to receive the data and convert it into a DStream, the data structure used by Spark Streaming. There are two ways to receive data: 1. receiving the data with a Receiver; 2. reading the data directly from Kafka. The system adopts the direct reading mode, which offers simplified parallelism, high efficiency and accuracy. The required push data is then produced through related processing such as word segmentation and statistics, and is written into a Redis cache.
Offline recommendation calculation: besides the logs, which record unstructured data about users, the learning system also records a large amount of structured user and learning data, such as the user's department, post, job level, educational background, hire date, nationality, political affiliation and gender, as well as the duration of courses studied, examination answers and the like;
This calculation is implemented by integrating Spark SQL and Spark MLlib. Spark SQL is a distributed query engine, and it is used to perform data mining on the offline structured user portrait and learning behavior data. Spark MLlib, the machine learning library that Spark provides for massive data, offers distributed implementations of common machine learning algorithms; data feature extraction, conversion and selection are carried out based on a User-based algorithm (which considers the similarity between users), an Item-based algorithm and a Model-based algorithm, and the required push data is written into a Redis cache.
1. Spark distributed computing principle
Before introducing the distributed machine learning training method of Spark MLlib, let us first review the distributed computing principle of Spark, which is the basis of distributed machine learning.
Spark is a distributed computing platform. Distributed here means that the computing nodes do not share memory and must exchange data through network communication. The most typical deployment of Spark runs on a large number of inexpensive computing nodes, which may be low-cost hosts or Docker containers; this differs from a CPU + GPU architecture and from a high-performance server architecture built on a shared-memory multiprocessor. This distinction is important for understanding the Spark computing principle described below.
A Spark program is scheduled and organized by the Cluster Manager, the specific computing tasks are executed by the Worker Nodes, and the result is finally returned to the Driver Program. On a physical Worker Node, the data may be further divided into different partitions; a partition can be regarded as the basic processing unit of Spark.
When a specific program is executed, Spark breaks the program down into a task DAG (directed acyclic graph) and then determines how to execute each step of the program according to the DAG. In the example, the program reads files through textFile and hadoopFile respectively, performs a join after a series of operations, and finally obtains the processing result.
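The idea of ordering work according to a task DAG can be sketched with Python's standard-library graphlib module (Python 3.9 or later). This is only an illustration of dependency-ordered execution, not a model of how Spark's scheduler actually divides a job into stages; the task names below are hypothetical and loosely follow the two-source join described above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set

# Hypothetical task graph: each task maps to the set of tasks it depends on.
TASK_DAG: Dict[str, Set[str]] = {
    "read_text_file": set(),
    "read_hadoop_file": set(),
    "map_text": {"read_text_file"},
    "filter_hadoop": {"read_hadoop_file"},
    "join": {"map_text", "filter_hadoop"},
}


def execution_waves(graph: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; every task in a wave has all of its dependencies
    completed, so the tasks within one wave could run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output deterministic
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For the graph above, the two read tasks form the first wave, the two per-source transformations form the second, and the join runs last, because it depends on both branches.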
2. Spark MLlib parallel training principle
With the Spark distributed computing process as a basis, the principle of Spark MLlib parallel training can be understood more clearly, as follows.
Among the mainstream machine learning models, the structure of the Random Forest model means that it can be trained entirely in a data-parallel manner, while the structure of GBDT means that its trees can only be trained serially; Spark's implementation of these models is not described further here. The emphasis is instead placed on the implementation of the gradient descent method, because how well gradient descent is parallelized directly determines the training speed of deep learning models, which build on Logistic Regression and are represented by the Multilayer Perceptron.
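The data-parallel form of gradient descent can be illustrated with a small pure-Python sketch for Logistic Regression. It is written under simplifying assumptions and is not Spark MLlib's implementation: the partitions are ordinary lists processed one after another in a single process, whereas a cluster would compute the per-partition gradients on different workers; the learning rate, epoch count and function names are arbitrary choices for the example.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (features, label in {0, 1})


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def partition_gradient(weights: Sequence[float], partition: List[Sample]) -> List[float]:
    """Sum of logistic-loss gradients over one partition (the work one worker would do)."""
    grad = [0.0] * len(weights)
    for x, y in partition:
        error = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - y
        for j, xi in enumerate(x):
            grad[j] += error * xi
    return grad


def train(partitions: List[List[Sample]], dim: int,
          lr: float = 0.5, epochs: int = 200) -> List[float]:
    """Batch gradient descent where each step sums per-partition partial gradients."""
    weights = [0.0] * dim
    n = sum(len(p) for p in partitions)
    for _ in range(epochs):
        # "Map": every partition computes its gradient against the same weights.
        partials = [partition_gradient(weights, p) for p in partitions]
        # "Reduce": the partial gradients are summed, then the weights are updated once.
        total = [sum(g[j] for g in partials) for j in range(dim)]
        weights = [w - lr * t / n for w, t in zip(weights, total)]
    return weights
```

Because the gradient of the total loss is the sum of the gradients over the samples, summing the per-partition results gives the same update as a single pass over all the data; the cost of each step is therefore dominated by the slowest partition plus the aggregation, which is why the degree of parallelism matters for training speed.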
Returning the recommendation result: the online recommended data and the offline recommended data are finally written into Redis, an interface is developed for the online learning system, and the recommended data is returned to the online learning system through this interface, thereby realizing learning content recommendation based on big data.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 5, a block diagram illustrating a structure of an embodiment of a big data-based online learning recommendation apparatus according to an embodiment of the present invention may specifically include the following modules:
the data acquisition module 301 is configured to acquire a user learning behavior log and unstructured industry data;
a recommendation result obtaining module 302, configured to process the user learning behavior log through an online recommendation component to obtain a first recommendation result; processing the unstructured industry data through an offline recommendation component to obtain a second recommendation result;
an input module 303, configured to input the first recommendation result and the second recommendation result into the cache;
an output module 304, configured to output the first recommendation result and the second recommendation result from the cache.
Preferably, the recommendation result obtaining module includes:
the first recommendation result acquisition submodule is used for acquiring the user learning behavior log through a Flume component, storing the user learning behavior log in a Kafka intermediate cache component, and processing the user learning behavior log through a Spark Streaming component to obtain the first recommendation result.
Preferably, the recommendation result obtaining module includes:
the data mining submodule is used for carrying out data mining on the unstructured industry data through a Spark SQL component;
and the second recommendation result acquisition submodule is used for processing the unstructured industry data through a Spark MLlib component to obtain the second recommendation result.
Preferably, the second recommendation result obtaining sub-module includes:
and the second recommendation result acquisition unit is used for performing feature extraction, conversion and selection on the unstructured industry data based on a User-based algorithm, an Item-based algorithm and a Model-based algorithm to obtain a second recommendation result.
Preferably, the user learning behavior log comprises: login information, course information, examination information and courseware learning information; the unstructured industry data comprises department information, post information, job level information, educational background information, hire date information and gender information.
The modules in the big data-based online learning recommendation device can be implemented wholly or partially by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, or can be stored in a memory in the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.
The online learning recommendation device based on big data provided by the above can be used for executing the online learning recommendation method based on big data provided by any of the above embodiments, and has corresponding functions and beneficial effects.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a big data based online learning recommendation method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, wherein the processor, when executing the computer program, implements the steps of the embodiments of fig. 1-X.
In one embodiment, a computer-readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, implements the steps of the embodiments of fig. 1-X.
The embodiments in the present specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for parts that are the same or similar among the embodiments, reference may be made from one embodiment to another.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The big data-based online learning recommendation method, the big data-based online learning recommendation device, the computer device, and the storage medium provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is intended only to help in understanding the method and core idea of the invention. Meanwhile, a person skilled in the art may, according to the idea of the present invention, make variations in the specific embodiments and the scope of application. In summary, the content of the present specification should not be construed as limiting the present invention.