Resource scheduling method and device
1. A method for scheduling resources, comprising:
acquiring a first scheduling request message, wherein the first scheduling request message is used for requesting a first amount of resources for a first type of task, and the first type of task is a Service Level Agreement (SLA)-sensitive task;
determining a resource server in a resource pool, wherein the first amount of resources is scheduled in the resource server for running the first type of task;
acquiring a second scheduling request message, wherein the second scheduling request message is used for requesting a second amount of resources for a second type of task, and the second type of task is an SLA-insensitive task; and
scheduling a third amount of resources in the resource server to run the second type of task, wherein the third amount of resources is less than or equal to the second amount of resources.
2. The method of claim 1, wherein the first type of task is capable of acquiring no more than the first amount of resources at any time during execution according to its needs.
3. The method according to claim 1 or 2, wherein the second type of task is capable of being interrupted during execution, and resources used by the second type of task are reclaimed upon interruption.
4. The method of any of claims 1 to 3, further comprising:
determining that a load rate of the resource server is greater than a first threshold, and migrating the first type of task or the second type of task to another resource server in the resource pool.
5. The method of any of claims 1 to 4, wherein said scheduling a third amount of resources in said resource server to run said second type of task comprises:
putting the second scheduling request message into a waiting queue; and
determining that a load rate of the resource server is less than a second threshold, and allocating resources of the resource server to the scheduling request in the waiting queue.
6. The method of any of claims 1 to 5, wherein a sum of the first amount of resources and the second amount of resources exceeds a total resource capacity of the resource server.
7. The method according to any of claims 1 to 6, wherein the first type of task is a task for which resource scheduling is performed based on a resource request amount, and the second type of task is a task for which resource scheduling is performed based on a resource usage amount.
8. A resource scheduling apparatus, comprising:
a first scheduler, configured to obtain a first scheduling request message, where the first scheduling request message is used to request a first amount of resources for a first type of task, and the first type of task is a Service Level Agreement (SLA)-sensitive task; and determine a resource server in a resource pool, where the first amount of resources is scheduled in the resource server for running the first type of task;
a second scheduler, configured to obtain a second scheduling request message, where the second scheduling request message is used to request a second amount of resources for a second type of task, and the second type of task is an SLA-insensitive task; and schedule a third amount of resources in the resource server to run the second type of task, where the third amount of resources is less than or equal to the second amount of resources.
9. An apparatus for resource scheduling, comprising at least one processor coupled to at least one memory, wherein:
the at least one processor is configured to execute computer programs or instructions stored in the at least one memory to cause the apparatus to perform the method of any of claims 1 to 7.
10. A resource scheduling system, comprising at least one resource server and a resource scheduling apparatus according to claim 8.
11. A readable storage medium, comprising a program or instructions which, when executed, cause the method of any of claims 1 to 7 to be performed.
Background
When a cloud data center is built, its operator invests heavily in computing facilities such as servers and switches, which provide the computing resources for cloud computing services. To improve the utilization rate of the computing resources of the cloud data center, the operator schedules computing tasks of different tenants onto the same computing facility through resource multiplexing technologies such as virtualization. Taking the cloud host service as an example, a tenant selects, according to the requirements of its tasks, a cloud host with an appropriate resource configuration for rent from a cloud host type list provided by the operator. When the tenant sends a start request for the cloud host, the operator selects, through a public cloud scheduling system and according to the resource configuration of the cloud host selected by the tenant, one physical server from all the physical servers of the cloud data center, and starts a virtual machine on that physical server to serve as the cloud host rented by the tenant. In this process, a reasonable virtual machine scheduling method can effectively reduce resource fragments on each physical server in the cloud data center, thereby ensuring a higher resource utilization rate.
Therefore, how to schedule resources more effectively is an urgent problem to be solved.
Disclosure of Invention
The embodiment of the application provides a resource scheduling method and device, which are used for improving the service quality while improving the resource utilization rate.
In a first aspect, an embodiment of the present application provides a resource scheduling method. When a first scheduling request message is acquired, a first resource server is determined from a resource pool according to a first amount of resources requested by the first scheduling request message, and the first amount of resources is scheduled in the first resource server, where the resource pool includes at least one resource server, and the first scheduling request message is used for requesting resources for a first type of task. When a second scheduling request message is acquired, if it is determined, according to the resource load rate of the resource pool, that resources are to be scheduled for the task corresponding to the second scheduling request message, a second resource server is determined from the resource pool according to a second amount of resources requested by the second scheduling request message, and a third amount of resources is scheduled in the second resource server, where the third amount is less than or equal to the second amount, and the second scheduling request message is used for requesting resources for a second type of task.
With this method, when the first resource server and the second resource server are the same resource server, the first type of task and the second type of task can be scheduled onto the same resource server, so that server resources that the first type of task has requested but is not using are effectively utilized and resource waste in a public cloud scenario is effectively avoided; a public cloud operator can therefore reduce the purchase of hardware resources, such as servers, for executing the second type of task, which saves the operator's service cost.
In a possible design, after the second scheduling request message is obtained, the second scheduling request message may further be placed into a waiting queue, where the waiting queue includes at least one scheduling request message of a second-type task. In this case, determining, according to the resource load rate of the resource pool, to schedule resources for the task corresponding to the second scheduling request message may include: if it is determined that the resource load rate of the resource pool is less than a first threshold, and the task corresponding to the second scheduling request message is the task with the smallest requested resource amount or the task with the longest waiting time in the waiting queue, determining to schedule resources for the task corresponding to the second scheduling request message.
In a possible design, the method further includes: if it is determined that the resource load rate of the resource pool is greater than or equal to the first threshold, selecting M tasks of the second type from the tasks executed by the at least one resource server, and releasing the resources occupied by the M tasks of the second type, where M is an integer greater than 0.
In another possible design, the method further includes: if the amount of idle resources in the second resource server is less than a second threshold, selecting N tasks of the second type from the tasks executed by the second resource server, and releasing the resources occupied by the N tasks of the second type, where N is an integer greater than 0.
With this method, by monitoring the resource load and performing predictive analysis on it, second-type tasks on heavily loaded resource servers can be interrupted in time, which prevents the second type of task from occupying resources required by the first type of task and avoids affecting the resource usage of the first type of task.
In a possible design, the method further includes: putting the N tasks of the second type into the waiting queue, where the waiting queue includes at least one scheduling request message of a second-type task.
In a possible design, when the first resource server is determined from the resource pool according to the first amount of resources requested by the first scheduling request message, a resource server whose amount of free resources is greater than the first amount may be selected as the first resource server from the at least one resource server included in the resource pool.
In a possible design, when the second resource server is determined from the resource pool according to the second amount of resources requested by the second scheduling request message, a resource server whose amount of idle resources is greater than the third amount may be selected as the second resource server from the at least one resource server included in the resource pool.
In the foregoing method, the first type of task may be a task for which resource scheduling is performed based on a resource request amount, and the second type of task may be a task for which resource scheduling is performed based on a resource usage amount.
In a second aspect, an embodiment of the present application provides a resource scheduling apparatus, including a processor coupled to a memory, where the memory is configured to store instructions, and the processor is configured to perform, according to the instructions stored in the memory, the method of the first aspect or any possible design of the first aspect.
In a third aspect, an embodiment of the present application provides a resource scheduling apparatus configured to implement the first aspect or any possible design of the first aspect. The resource scheduling apparatus includes corresponding functional modules, for example, a first scheduler, a second scheduler, and a load control module, which are respectively configured to implement the steps in the foregoing method.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having computer-readable instructions stored thereon, which, when read and executed by a computer, cause the computer to perform the method of any one of the above aspects or any one of the possible designs of any one of the above aspects.
In a fifth aspect, embodiments of the present application provide a computer program product which, when read and executed by a computer, causes the computer to perform the method of any one of the above aspects or any one of the possible designs of any one of the aspects.
In a sixth aspect, an embodiment of the present application provides a chip, where the chip is connected to a memory, and is configured to read and execute a software program stored in the memory, so as to implement the method in any one of the above aspects or any one of the possible designs of any one of the above aspects.
In a seventh aspect, an embodiment of the present application provides a resource scheduling system, including the resource scheduling apparatus in the second aspect and multiple resource servers.
Drawings
Fig. 1 is a schematic structural diagram of a resource scheduling system according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a resource scheduling method according to an embodiment of the present application;
Fig. 3 is a structural framework diagram of a resource scheduling device according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a monitoring data transmission flow according to an embodiment of the present application;
Fig. 5 is a schematic resource scheduling diagram of a first type of task according to an embodiment of the present application;
Fig. 6 is a schematic diagram of resource scheduling according to an embodiment of the present application;
Fig. 7 is a schematic diagram of resource scheduling according to an embodiment of the present application;
Fig. 8 is a schematic resource scheduling diagram of a second type of task according to an embodiment of the present application;
Fig. 9 is a schematic diagram of resource scheduling according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of a resource scheduling apparatus according to an embodiment of the present application.
Detailed Description
The embodiments of the present application will be described in detail below with reference to the drawings attached hereto.
For ease of understanding the embodiments of the present application, the system to which the embodiments of the present application are applied is first described in detail, taking the system architecture shown in Fig. 1 as an example. Fig. 1 shows an architecture diagram of a public cloud service system suitable for the embodiments of the present application. As shown in Fig. 1, the system includes a resource pool, which may include at least one resource server, and a resource scheduling device for controlling the resource pool.
Tenants submit scheduling request messages for different types of tasks to the resource scheduling device from different service interfaces through a console (not shown in Fig. 1) of the resource scheduling device of the public cloud service. The resource scheduling device schedules a resource server from the resource pool according to the scheduling request messages of the different types of tasks, and schedules corresponding processing resources from the scheduled resource server to be allocated to the tenants for use.
The architecture and the service scenario described in the embodiments of the present application are intended to illustrate the technical solutions of the embodiments more clearly and do not limit the technical solutions provided in the embodiments of the present application. A person skilled in the art will appreciate that, as the architecture evolves and new service scenarios emerge, the technical solutions provided in the embodiments of the present application are also applicable to similar system architectures.
Fig. 2 is a schematic flowchart of a resource scheduling method provided in an embodiment of the present application, described below in combination with the above application scenario.
The method comprises the following steps:
step 201: when acquiring a first scheduling request message, a resource scheduling device is used for requesting resources for a first type of task, and according to a first amount of resources requested by the first scheduling request message, the resource scheduling device determines a first resource server from a resource pool and schedules the first amount of resources in the first resource server.
The task type refers to a service type corresponding to the task, and the task for executing one service type may be referred to as a type of task. Resources herein include, but are not limited to, processor resources, memory resources, bandwidth resources, and the like.
Step 202: when the resource scheduling equipment acquires a second scheduling request message, the second scheduling request message is used for requesting resources for a second type of task, if the resource scheduling equipment determines that the resource is the task scheduling resource corresponding to the second scheduling request message according to the current resource load rate of a resource pool, a second resource server is determined from the resource pool according to a second quantity of resources requested by the second scheduling request message, and a third quantity of resources are scheduled in the second resource server; wherein the third number is less than or equal to the second number.
The first resource server and the second resource server may be the same resource server, or may be two different resource servers, which is not limited in this embodiment of the application.
In this embodiment, two different scheduling methods are used for the two different types of tasks, which improves the resource utilization rate, saves the operator's cost, and avoids affecting Service Level Agreement (SLA)-sensitive tasks. First, a scheduling method based on the actual resource usage is added to a system that schedules according to the resource request amount, so that the first type of task and the second type of task can be scheduled onto the same resource server; resources that the first type of task has requested but is not using on that resource server are effectively utilized, and resource waste in a public cloud scenario is effectively avoided. Second, since the second type of task can be scheduled onto the resources left over by the first type of task, the public cloud operator can reduce the purchase of hardware resources, such as servers, for executing the second type of task, thereby saving the operator's service cost.
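As an illustration only, the two-branch flow of steps 201 and 202 can be expressed as a small Python sketch. All of the names below (Server, Request, schedule_first_type, schedule_second_type) and the 0.8 load threshold are assumptions made for this example and are not part of the claimed method; the sketch merely restates the rules above: a first-type request is placed immediately by its requested amount, while a second-type request is admitted only when the resource load rate of the pool permits and receives at most its requested amount.

    from dataclasses import dataclass

    # Illustrative model only; Server and Request are assumed names.
    @dataclass
    class Server:
        capacity: float
        reserved: float = 0.0                 # resources already scheduled on this server

        def free(self) -> float:
            return self.capacity - self.reserved

    @dataclass
    class Request:
        task_id: str
        amount: float                         # requested amount of resources
        usage: float = 0.0                    # estimated or measured actual usage

    def schedule_first_type(pool, req):
        # Step 201: pick a server whose free resources exceed the first amount
        # and reserve the full requested amount for the SLA-sensitive task.
        server = next((s for s in pool if s.free() > req.amount), None)
        if server is not None:
            server.reserved += req.amount
        return server

    def schedule_second_type(pool, req, waiting_queue, load_threshold=0.8):
        # Step 202: admit the SLA-insensitive task only when the resource load
        # rate of the pool is below a threshold; otherwise it keeps waiting.
        load_rate = sum(s.reserved for s in pool) / sum(s.capacity for s in pool)
        if load_rate >= load_threshold:
            waiting_queue.append(req)
            return None
        third_amount = min(req.amount, req.usage or req.amount)   # <= second amount
        server = next((s for s in pool if s.free() > third_amount), None)
        if server is not None:
            server.reserved += third_amount
        return server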
Fig. 3 is a schematic diagram of the architecture of a resource scheduling device in a public cloud service system according to an embodiment of the present application. As shown in Fig. 3, the resource scheduling device includes a first scheduler, a second scheduler, a load control module, a waiting queue, a message queue module, and the like. The scheduling request message of a first-type task may be submitted to the first scheduler, and the scheduling request message of a second-type task may be submitted to the second scheduler.
The first scheduler is configured to obtain the scheduling request messages of the first type of task, and the second scheduler is configured to obtain the scheduling request messages of the second type of task. The first type of task may be a task for which resource scheduling is performed based on the resource request amount; the second type of task may be a task for which resource scheduling is performed based on the resource usage amount. In this embodiment of the application, the first type of task may also be referred to as a Service Level Agreement (SLA)-sensitive task, and the second type of task may also be referred to as an SLA-insensitive task. During execution, an SLA-sensitive task can acquire, according to its needs at any time, resources up to but not exceeding its resource request amount. An SLA-insensitive task may obtain fewer resources than its resource request amount during execution, and when the load of the resource pool is too high, the resources it is using can be reclaimed, interrupting the execution of the task.
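The two guarantees stated above can be condensed into a small illustrative Python class. The Task class and its methods are hypothetical and are not defined anywhere in the embodiments; they merely encode the two rules: resource grants are capped by the request amount, and only an SLA-insensitive task may have its in-use resources reclaimed and its execution interrupted.

    # Hypothetical illustration of the two task categories described above.
    class Task:
        def __init__(self, task_id, requested, sla_sensitive):
            self.task_id = task_id
            self.requested = requested        # resource request amount
            self.granted = 0.0                # resources currently held
            self.sla_sensitive = sla_sensitive
            self.interrupted = False

        def acquire(self, amount):
            # Grants never exceed the request amount; for an SLA-sensitive task
            # this cap is guaranteed to be available at any time.
            grant = min(amount, self.requested - self.granted)
            self.granted += grant
            return grant

        def reclaim(self):
            # Only SLA-insensitive tasks may be interrupted and have their
            # resources reclaimed when the resource pool is overloaded.
            if self.sla_sensitive:
                raise RuntimeError("SLA-sensitive tasks are never reclaimed")
            freed, self.granted, self.interrupted = self.granted, 0.0, True
            return freed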
In this embodiment of the application, a corresponding type, namely the first type or the second type, is set for each task in advance, and each type of task can only apply for resources through its corresponding scheduler. In this way, the resource utilization rate can be improved through the second type of task while the first type of task is not affected.
After the first scheduler acquires the scheduling request message of a first-type task, it guarantees, according to the resource request amount of the task, that the task can at any time acquire an amount of resources equal to its request amount.
After the second scheduler acquires the scheduling request message of a second-type task, it does not immediately allocate the requested amount of resources to the task; instead, it puts the scheduling request message of the task into the waiting queue. When it is determined, according to the resource load rate of the resource pool, that resources should be scheduled for this type of task, an amount of resources not exceeding the request amount of the task is allocated according to the actual resource usage. The second scheduler can monitor and predict the actual resource usage of each resource server through the load control module, and when the predicted actual resource usage of the tasks increases, some second-type tasks are closed in time so that the first-type tasks have sufficient resources.
Each resource server in the resource pool includes an agent module. The agent module is responsible, on the one hand, for executing the resource allocation decisions of the schedulers and, on the other hand, for monitoring the resource load of the resource server where it is located and the actual resource usage of each task on that resource server. After a resource server is selected for executing a task, the scheduler transmits the data and information related to the task to the agent module on that resource server. The agent module prepares an execution environment for the task in the resource server according to the scheduler's decision, allocates the required resources, and creates a task instance. When the scheduler decides to interrupt some SLA-insensitive tasks on a resource server, it transmits the information about the tasks to be interrupted to the agent module on that resource server, and the agent module interrupts the execution of those tasks and releases the resources they occupy.
In this embodiment of the application, the agent module in each resource server periodically reads the actual resource usage data of each task on the resource server and, after analyzing and summarizing the data, periodically sends monitoring data to the message queue module. The monitoring data includes, but is not limited to, the resource load rate of the resource server, the actual resource usage of each executed task, the task type of each executed task, and the like.
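As an illustration, one possible shape of the monitoring data an agent module could publish is sketched below. The field names, the JSON encoding, and the publish helper are assumptions for this example; the embodiments do not prescribe a particular format.

    import json
    import time

    def build_monitoring_record(server_id, capacity, tasks):
        # tasks: list of dicts carrying task_id, task_type and measured usage.
        used = sum(t["usage"] for t in tasks)
        return {
            "server_id": server_id,
            "timestamp": time.time(),
            "load_rate": used / capacity,              # resource load rate of the server
            "tasks": [
                {"task_id": t["task_id"],
                 "task_type": t["task_type"],          # "first" or "second"
                 "usage": t["usage"]}                  # actual resource usage
                for t in tasks
            ],
        }

    def publish(message_queue, server_id, capacity, tasks):
        # The agent module would call this periodically, pushing the record
        # to the message queue module for the load control module to read.
        message_queue.append(json.dumps(build_monitoring_record(server_id, capacity, tasks)))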
For example, Fig. 4 is a schematic diagram of the monitoring data transmission flow provided in an embodiment of the present application.
Step 401: The resource servers 1 to K included in the resource pool periodically send monitoring data to the message queue module.
Step 402: The message queue module classifies and summarizes the monitoring data sent by each resource server and provides the result for the load control module to read.
Step 403: The load control module periodically sends a monitoring data request message to the message queue module, where the monitoring data request message is used to request the monitoring data.
Step 404: Correspondingly, the message queue module sends a monitoring data response message to the load control module, where the monitoring data response message includes the monitoring data requested by the load control module.
The load control module reads the monitoring data of each agent module from the message queue module, performs predictive analysis, based on the read monitoring data, of the resource load of the resource pool and the actual resource usage of the tenant tasks over a future period of time (for example, 1 hour), and makes two kinds of logical judgment based on the prediction result. On the one hand, the load control module judges, according to the prediction result, whether to select a task cached in the waiting queue for execution. When the predicted load is low, the load control module acquires scheduling request messages (of tasks that have not yet been executed or whose execution has been interrupted) from the waiting queue, screens out a task suitable for execution, and allocates computing resources to it through the scheduler. On the other hand, the load control module also needs to determine whether a running SLA-insensitive task needs to be interrupted. When the predicted resource load of a resource server is high and there is a risk that the actual resource usage of its tasks will exceed the resource capacity of the server, the load control module transmits the information of that resource server to the second scheduler, and the second scheduler selects some SLA-insensitive tasks on the resource server to close, so that the remaining tasks can acquire sufficient resources during execution. It should be noted that how the load control module performs the predictive analysis of the resource load of the resource pool and the actual resource usage of the tenant tasks based on the read monitoring data is not limited in this embodiment of the application and is not described here again.
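The two judgments can be illustrated with the following Python sketch. The thresholds, the predicted_usage field, and the second-scheduler callbacks are placeholders chosen for this example; as noted above, the embodiments do not prescribe how the prediction itself is computed.

    def load_control_step(records, waiting_queue, second_scheduler,
                          pool_threshold=0.8, free_threshold=0.1):
        # records: one monitoring record per server, each assumed to carry the
        # server capacity and a predicted actual usage for the coming period.
        total_capacity = sum(r["capacity"] for r in records)
        predicted_pool_load = sum(r["predicted_usage"] for r in records) / total_capacity

        # Judgment 1: when the predicted pool load is low, take a queued
        # second-type request and hand it to the second scheduler.
        if predicted_pool_load < pool_threshold and waiting_queue:
            candidate = min(waiting_queue, key=lambda req: req["amount"])
            waiting_queue.remove(candidate)
            second_scheduler.schedule(candidate, records)

        # Judgment 2: when a server risks running out of resources, ask the
        # second scheduler to interrupt some SLA-insensitive tasks on it.
        for r in records:
            predicted_free = r["capacity"] - r["predicted_usage"]
            if predicted_free < free_threshold * r["capacity"]:
                second_scheduler.release_on(r["server_id"])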
In this embodiment of the present application, the flow of creating and closing a first-type task may be as shown in Fig. 5. The flow includes the following processing steps:
Step 501: The tenant sends a first scheduling request message to the console of the resource scheduling device, where the first scheduling request message is used for requesting a first amount of resources for a first type of task. The first scheduling request message may also include the identity information of the tenant, information about the corresponding task, and the like, which is not limited in this embodiment and is not described here again.
Step 502: The console of the resource scheduling device verifies the identity information of the tenant and the validity of the first scheduling request message. How the console performs the verification is not limited in the embodiments of the present application and is not described here again. If the verification succeeds, step 503 is executed; otherwise, the tenant's request is rejected. The following description takes the case where the verification succeeds as an example.
Step 503: The console of the resource scheduling device submits the first scheduling request message to the first scheduler.
Step 504: The first scheduler determines a first resource server from the resource pool based on the first amount of resources requested by the first scheduling request message. The first scheduler may select, as the first resource server, a resource server whose amount of free resources is greater than the first amount from the at least one resource server included in the resource pool.
To ensure that a first-type task submitted by a tenant can obtain enough resources, the first scheduler of the public cloud may perform resource scheduling according to the resource request amount of the task. As shown in Fig. 6, assume that the tenant submits task 1, task 2, and task 3 in order. The first scheduler schedules resources according to the resource request amounts of the tasks and schedules task 1 and task 2 onto resource server 1. When the tenant submits the execution request of task 3, even if resource server 1 still has resources that are not actually used, as long as the remaining amount of resources is smaller than the resource request amount of task 3, the first scheduler schedules task 3 onto the idle resource server 2.
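The Fig. 6 behaviour can be replayed with the request-amount accounting described above. The server capacities and request amounts below are invented for illustration and are not taken from the figure; the point is only that the first scheduler accounts by reserved request amounts, not by actual usage.

    # Each server tracks the sum of *requested* amounts, not the actual usage.
    servers = [{"id": 1, "capacity": 16, "requested": 0},
               {"id": 2, "capacity": 16, "requested": 0}]

    def place_by_request(amount):
        # First scheduler: a server qualifies only if its capacity minus the
        # already-reserved request amounts still exceeds the new request.
        for s in servers:
            if s["capacity"] - s["requested"] > amount:
                s["requested"] += amount
                return s["id"]
        return None

    # Tasks 1 and 2 fit on server 1; task 3 no longer does, even though the
    # tasks on server 1 may not actually be using all of their reservations.
    print(place_by_request(8))   # task 1 -> server 1
    print(place_by_request(6))   # task 2 -> server 1
    print(place_by_request(6))   # task 3 -> server 2 (server 1 has only 2 left by request)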
Step 505: The first scheduler sends a task creation request to the first resource server, where the task creation request is used for requesting the creation of the task corresponding to the first scheduling request message and requesting the scheduling of the first amount of resources for the task.
Step 506: The agent module in the first resource server creates the task according to the task creation request and schedules the first amount of resources.
Step 507: The first resource server sends a task creation response to the first scheduler, where the task creation response is used for indicating the result of the task creation request.
Step 508: The first resource server sends a task creation notification message to the console, where the task creation notification message is used for indicating the result of the first scheduling request message.
Step 509: The console sends a first scheduling response message to the tenant according to the task creation notification message, where the first scheduling response message is used for indicating the result of the first scheduling request message.
Through the above process, resources are scheduled for the task corresponding to the first scheduling request message according to the tenant's first scheduling request message, and the task is created.
Further, if it is determined that the resource load rate of the resource pool is greater than or equal to the first threshold, M tasks of the second type may be selected from the tasks executed by the at least one resource server, and the resources occupied by the M tasks of the second type are released, where M is an integer greater than 0. Optionally, the M tasks of the second type may be placed into the waiting queue to wait to be called for execution later.
Further, in this embodiment of the application, a first-type task and a second-type task may be executed simultaneously on one resource server. The first scheduler analyzes and predicts the resource load of each resource server over a future period of time based on the amount of resources used by each task on that server. When the first scheduler determines that the amount of idle resources in the first resource server is smaller than a second threshold, or that the load rate of the first resource server is greater than a preset load rate (for example, 90%), it selects at least one second-type task on the first resource server and releases the resources occupied by that task. In this way, this embodiment schedules the first type of task according to the resource request amount and, by monitoring the resource load and performing predictive analysis on it, interrupts second-type tasks on heavily loaded resource servers in time. This prevents second-type tasks from occupying resources required by first-type tasks and avoids affecting the resource usage of first-type tasks. That is, with the above method, the resource allocation of the first type of task can be guaranteed preferentially.
For example, as shown in Fig. 7, assume that a tenant submits task 1, task 2, and task 3 in order, where task 1 is a first-type task and tasks 2 and 3 are second-type tasks. The first scheduler performs resource scheduling according to the resource request amount and schedules task 1 onto resource server 1; the second scheduler performs resource scheduling according to the resource usage of the tasks and schedules tasks 2 and 3 onto resource server 1. The amount of resources scheduled for task 1 equals its resource request amount, while the amounts scheduled for tasks 2 and 3 are both smaller than their respective request amounts. Over time, the actual resource usage of task 2 and task 3 gradually increases and approaches their respective request amounts, so that the combined actual usage of the three tasks would exceed the total resource capacity of resource server 1. At this point, the execution of some tasks may need to be interrupted to ensure that the other tasks can obtain sufficient computing resources. Interrupting a first-type task during its execution would seriously affect the tenant's experience and damage the operator's brand and reputation. Therefore, the second type of task may be forcibly interrupted: both task 2 and task 3 may be interrupted and their resources released. Further, the interrupted second-type tasks may be put back into the waiting queue to wait to be called for execution again.
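A minimal sketch of the Fig. 7 situation, assuming the simple dictionary structures used in the earlier examples: when the combined actual usage risks exceeding the server capacity, the second-type tasks on that server are interrupted, their resources released, and their requests returned to the waiting queue. The structures and the safety_margin parameter are illustrative assumptions.

    def relieve_overcommit(server, waiting_queue, safety_margin=0.0):
        # server: {"capacity": c, "tasks": [{"task_id", "task_type", "usage", "amount"}, ...]}
        used = sum(t["usage"] for t in server["tasks"])
        if used + safety_margin <= server["capacity"]:
            return []                                   # no risk, nothing to interrupt
        # Only second-type (SLA-insensitive) tasks may be interrupted.
        victims = [t for t in server["tasks"] if t["task_type"] == "second"]
        for t in victims:
            server["tasks"].remove(t)                   # interrupt and release its resources
            waiting_queue.append({"task_id": t["task_id"], "amount": t["amount"]})
        return victims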
Optionally, in this embodiment of the application, the tenant may also actively request to close a task. Referring again to Fig. 5, the flow is as follows:
Step 510: The tenant sends a task closing request to the console, where the task closing request is used for requesting the closing of the task corresponding to the first scheduling request message.
Step 511: The console forwards the task closing request to the first resource server.
Step 512: The first resource server closes the task corresponding to the first scheduling request message according to the task closing request and releases the resources scheduled to the task.
Step 513: The first resource server sends a task closing completion notification message to the console, where the task closing completion notification message is used for indicating the result of the task closing.
Step 514: The console forwards the task closing completion notification message to the tenant.
In this embodiment of the present application, unlike the first type of task, a scheduling request message of the second type of task does not immediately obtain the requested resources, but needs to be queued. Specifically, the flow of creating and interrupting a second-type task may be as shown in Fig. 8. The flow includes the following steps:
Step 801: The tenant sends a second scheduling request message to the console of the resource scheduling device, where the second scheduling request message is used for requesting a second amount of resources for a second type of task. The second scheduling request message may also include the identity information of the tenant, information about the corresponding task, and the like, which is not limited in this embodiment and is not described here again.
Step 802: The console of the resource scheduling device verifies the identity information of the tenant and the validity of the second scheduling request message. How the console performs the verification is not limited in the embodiments of the present application and is not described here again. If the verification succeeds, step 803 is executed; otherwise, the tenant's request is rejected. The following description takes the case where the verification succeeds as an example.
Step 803: the console places the second scheduling request message into a waiting queue.
Step 804: The console sends a queuing notification message to the tenant, where the queuing notification message is used for indicating that the second scheduling request message is in the waiting queue.
Step 805: The load control module sends a queuing information request message, where the queuing information request message is used for requesting all the queued scheduling request messages in the waiting queue.
Step 806: The load control module receives a queuing information response message, where the queuing information response message includes information such as all the queued scheduling request messages in the waiting queue.
Step 807: The load control module determines to schedule resources for the task corresponding to the second scheduling request message. It should be noted that, when the load control module predicts that the resource load rate of the resource pool is smaller than the first threshold, it screens the scheduling request messages queued in the waiting queue, and then submits the screened scheduling request message, together with the load rate of each resource server, to the second scheduler as a task scheduling request.
For example, if it is determined that the resource load rate of the resource pool is smaller than the first threshold, and the task corresponding to the second scheduling request message is the task with the smallest requested resource amount or the task with the longest waiting time in the waiting queue, it is determined to schedule resources for the task corresponding to the second scheduling request message.
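The screening rule described in step 807 and in the example above can be written as a small helper. The selection key, the field names, and the first_threshold default are assumptions for this sketch; the embodiments only require that the chosen task be the one with the smallest requested amount or the longest waiting time.

    def screen_waiting_queue(waiting_queue, pool_load_rate,
                             first_threshold=0.8, by="smallest_request"):
        # Screen only when the predicted pool load rate is below the first threshold.
        if pool_load_rate >= first_threshold or not waiting_queue:
            return None
        if by == "smallest_request":
            chosen = min(waiting_queue, key=lambda r: r["amount"])
        else:  # "longest_wait": entries are assumed to carry an enqueue timestamp
            chosen = min(waiting_queue, key=lambda r: r["enqueued_at"])
        waiting_queue.remove(chosen)
        return chosen   # submitted to the second scheduler as a task scheduling request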
Step 808: the load control module sends a second scheduling request message to the second scheduler.
Step 809: The second scheduler determines a second resource server from the resource pool based on the second amount of resources requested by the second scheduling request message. The second scheduler may select, as the second resource server, a resource server whose amount of free resources is greater than the third amount from the at least one resource server included in the resource pool.
In this embodiment of the application, the third amount may be the actual resource usage of the task corresponding to the second scheduling request message, or may be the product of the second amount and a preset weight value, where the preset weight value is a number greater than 0 and less than or equal to 1.
To ensure that first-type tasks submitted by tenants can obtain enough resources while the resource utilization rate is improved, the second scheduler of the public cloud may perform resource scheduling according to the resource usage of the tasks. As shown in Fig. 9, assume that the tenant submits task 1, task 2, and task 3 in order, where task 1 is a first-type task and tasks 2 and 3 are second-type tasks. The first scheduler performs resource scheduling according to the resource request amount and schedules task 1 onto resource server 1; the second scheduler performs resource scheduling according to the resource usage of the tasks and schedules task 2 onto resource server 1. When the tenant submits the execution request of task 3, even if the amount of resources remaining on resource server 1 that are not actually used is smaller than the resource request amount of task 3, the second scheduler can still schedule task 3 onto resource server 1 in order to improve the resource utilization rate. This effectively avoids resource fragments (resources that cannot be allocated to tenant cloud hosts) on each resource server in the cloud data center, ensuring a higher resource utilization rate.
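One way to read the definition of the third amount and the usage-based placement of step 809 is sketched below. The 0.5 default weight and the dictionary structures are assumptions for this example; the embodiments only require that the preset weight value be greater than 0 and at most 1 and that the third amount not exceed the second amount.

    def third_amount(request, weight=0.5):
        # Either the actual resource usage of the task, or the requested (second)
        # amount multiplied by a preset weight value with 0 < weight <= 1; never
        # more than the requested amount.
        if request.get("usage") is not None:
            return min(request["usage"], request["amount"])
        return request["amount"] * weight

    def place_by_usage(servers, request, weight=0.5):
        # Second scheduler: a server qualifies when its free resources, measured
        # by actual usage rather than by reservations, exceed the third amount.
        need = third_amount(request, weight)
        for s in servers:
            free = s["capacity"] - sum(t["usage"] for t in s["tasks"])
            if free > need:
                s["tasks"].append({"task_id": request["task_id"],
                                   "task_type": "second",
                                   "usage": need,
                                   "amount": request["amount"]})
                return s["id"]
        return None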
Step 810: The second scheduler sends a task creation request to the second resource server, where the task creation request is used for requesting the creation of the task corresponding to the second scheduling request message and requesting the scheduling of the third amount of resources for the task.
Step 811: The agent module in the second resource server creates the task according to the task creation request and schedules the third amount of resources.
Step 812: The second resource server sends a task creation response to the second scheduler, where the task creation response is used for indicating the result of the task creation request.
Step 813: The second resource server sends a task creation notification message to the console, where the task creation notification message is used for indicating the result of the second scheduling request message.
Step 814: The console sends a second scheduling response message to the tenant according to the task creation notification message, where the second scheduling response message is used for indicating the result of the second scheduling request message.
Through the above process, resources are scheduled for the task corresponding to the second scheduling request message according to the tenant's second scheduling request message, and the task is created.
Further, in this embodiment of the present application, a first-type task and a second-type task may be executed simultaneously on one resource server. If the amount of idle resources in the second resource server is smaller than a second threshold, N tasks of the second type are selected from the tasks executed by the second resource server, and the resources occupied by the N tasks of the second type are released, where N is an integer greater than 0. Optionally, the N second-type tasks may also be placed into the waiting queue to wait to be called for execution later. Specifically, as shown in Fig. 8, the following steps may further be included.
Step 815: When the load control module predicts that the amount of idle resources in the second resource server is smaller than the second threshold, the load control module sends a resource release request to the second scheduler, where the resource release request is used for requesting the release of part of the resources.
Step 816: The second scheduler determines at least one second-type task whose execution needs to be interrupted. The second scheduler may determine the M second-type tasks with the largest resource usage as the tasks that need to be interrupted, or may determine the tasks to be interrupted in another way. The following description takes, as an example, the case where the second scheduler decides to interrupt the task corresponding to the second scheduling request message; other cases are not described again.
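Step 816 leaves the victim-selection policy open. The possibility mentioned above, interrupting the M second-type tasks with the largest resource usage, could be expressed as follows; the structures are the same assumed dictionaries used in the earlier sketches.

    def pick_victims(server_tasks, m):
        # server_tasks: [{"task_id": ..., "task_type": ..., "usage": ...}, ...]
        # Only second-type (SLA-insensitive) tasks are eligible for interruption.
        candidates = [t for t in server_tasks if t["task_type"] == "second"]
        candidates.sort(key=lambda t: t["usage"], reverse=True)
        return candidates[:m]   # the M second-type tasks with the largest usage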
Step 817: The second scheduler sends a task interruption request to the second resource server, where the task interruption request is used for requesting the interruption of the execution of the task corresponding to the second scheduling request message and the release of the corresponding resources.
Step 818: The second resource server interrupts the execution of the task corresponding to the second scheduling request message according to the task interruption request and releases the resources scheduled to the task.
Step 819: The second resource server sends a task interruption response to the second scheduler, where the task interruption response is used for indicating the result of the task interruption.
Step 820: The second resource server sends a task interruption notification message to the console, where the task interruption notification message is used for indicating that the execution of the task corresponding to the second scheduling request message has been interrupted.
Step 821: The console forwards the task interruption notification message to the tenant.
With the above method, based on the differences among tenant tasks in a public cloud environment, SLA-sensitive tasks are scheduled preferentially according to their resource request amounts, so that each SLA-sensitive task can obtain enough resources during execution. For SLA-insensitive tasks, a scheduling method based on the actual resource usage is adopted: while the resources requested but not used by SLA-sensitive tasks are fully utilized, the creation and interruption of SLA-insensitive tasks are dynamically controlled through load monitoring and prediction, so that the resource usage of SLA-sensitive tasks is not affected.
Fig. 10 is a schematic structural diagram of a resource scheduling apparatus according to an embodiment of the present application. The apparatus 1000 includes a communication module 1001, a processor 1002, a memory 1003, and the like.
The communication module 1001 in this embodiment may be a communication chip with wired or wireless communication capability, such as a radio frequency transceiver or a network cable interface, and is configured to acquire the first scheduling request message and the second scheduling request message in the above method flow.
The processor 1002 in this embodiment may be an integrated circuit chip with signal processing capability. During implementation, the steps of the above method embodiments may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and may implement or execute the methods, steps, and logical blocks disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In combination with the steps of the methods disclosed in the embodiments of the present application, the processor may determine whether to schedule a resource server, schedule corresponding resources for the corresponding scheduling request message in the determined resource server, select the tasks whose resources need to be released, release the server resources occupied by the selected tasks, and the like.
The memory 1003 in this embodiment of the application may be a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, a register, or another storage medium well known in the art. The processor 1002 reads the information in the memory 1003 and, in combination with its hardware, can perform the steps of the above method.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.