CN107481260A - Regional crowd retention detection method, device and storage medium - Google Patents
- Publication number
- CN107481260A (application CN201710482241.8A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T7/215—Motion-based segmentation
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
- G06T7/207—Analysis of motion for motion estimation over a hierarchy of resolutions
- G06T2207/10016—Video; Image sequence
- G06T2207/20024—Filtering details
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a regional crowd retention detection method, device and storage medium. The method includes: acquiring a crowd foreground in a scene image; obtaining key points of the crowd foreground; tracking the key points to obtain their motion trajectories; calculating the movement distance of each key point according to its motion trajectory; if the movement speed of a key point is less than a preset threshold, marking the key point as a retention key point and generating a stagnation point matrix; and updating a retention statistical matrix according to the stagnation point matrix. Key points of the crowd foreground are obtained through foreground segmentation, and tracking the key points of the crowd regions in the scene greatly increases the robustness of human tracking, which is no longer affected by crowd adhesion. The motion pattern of the crowd regions represented by the key points is then analyzed, and the crowd retention time is estimated through the retention statistical matrix, so that the crowd retention situation of the whole scene is monitored and analyzed.
    Description
Technical Field
      The invention relates to a computer vision technology, in particular to a regional crowd retention detection method, a device and a storage medium.
    Background
      Regional crowd behavior analysis is an important application of computer vision and image analysis technology in the field of video surveillance. The general approach is to capture real-time video of public safety areas such as parks, squares and large indoor venues with a camera, analyze and predict the number of people in the scene and their movement trend using intelligent image analysis, and make decisions based on the analysis results so as to prevent crowd accidents such as crowding and trampling.
      The invention mainly carries out detection and analysis aiming at the action of crowd detention in a scene, and counts the detention time of the crowd in each area in a picture, thereby effectively detecting the abnormal actions of crowd detention, wandering and the like in sensitive areas, such as railway stations, airports, subway entrances and exits, passages, stairs or military confidential areas. Meanwhile, by analyzing the length of the residence time of the crowd in each area in the scene, people stream evacuation planning can be performed in the current scene when some dangers occur.
      In traditional regional crowd behavior analysis, crowd retention detection is usually based on tracking foreground moving targets: background modeling is used to segment the moving targets, the targets are then tracked, and the retention time of each target in a specified region is counted. Detecting moving targets by background modeling does not discriminate between targets, so it cannot tell whether a moving target is a person, a vehicle or something else. Some algorithms add a simple size-based screening, but this is easily disturbed by the camera angle and the scale of the scene, which leads to false detections or missed detections. In addition, with the rise of deep learning in recent years, target detection has improved considerably, and some methods replace background-model-based target detection with deep-learning-based human detection. This solves part of the problem, but deep-learning-based human detection cannot cope with larger scenes that contain many people with severe adhesion. Moreover, existing moving-target tracking algorithms require the targets to be well separated, and they often perform poorly when several targets adhere to one another. Finally, judging retention by analyzing the trajectory of a moving target only detects people who stay still; if the target moves within a small range, it is easily missed.
      In summary, existing crowd retention detection schemes have the following disadvantages: they are limited by the accuracy of foreground moving-target segmentation, their tracking is easily affected by crowd adhesion, and in practical applications they are easily disturbed by complex backgrounds, weather and similar factors, so false alarms and missed alarms occur very easily.
    Disclosure of Invention
      In order to overcome the defects of the prior art, a first objective of the present invention is to provide a regional crowd retention detection method that solves the problems of existing crowd retention detection schemes: they are limited by the accuracy of foreground moving-target segmentation, their tracking is easily affected by crowd adhesion, and in practical applications they are easily disturbed by complex backgrounds, weather and similar factors, so false alarms and missed alarms occur very easily.
      A second objective of the present invention is to provide a regional crowd retention detection device that solves the same problems.
      A third objective of the present invention is to provide a storage medium storing a computer program, which likewise solves these problems.
      One of the purposes of the invention is realized by adopting the following technical scheme:
      a regional population retention detection method comprises the following steps:
      acquiring a crowd foreground in the scene image;
      obtaining key points of the crowd foreground;
      tracking the key points to obtain the motion trail of the key points;
      calculating the movement distance of the key point according to the movement track;
      if the movement speed of the key point is smaller than a preset threshold value, marking the key point as a stagnation key point and generating a stagnation point matrix;
      and updating the retention statistical matrix according to the retention point matrix.
      Further, the regional population retention detection method further comprises the following steps: calculating a perspective matrix of the scene image;
      the preset threshold is specifically calculated according to the perspective matrix;
      if the movement speed of the key point is less than a preset threshold value, marking the key point as a stagnation key point, and after generating a stagnation point matrix, further comprising the following steps:
      and performing Gaussian filtering on the stagnation point matrix, and obtaining the standard deviation of a filtering kernel of the Gaussian filtering according to the perspective matrix.
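      Further, obtaining the key points of the crowd foreground specifically includes the following sub-steps:
      calculating the gradients $I_x$ and $I_y$ of the crowd foreground $F_t(x, y)$ in the X and Y directions:
      $$I_x = \frac{\partial F_t}{\partial x}, \qquad I_y = \frac{\partial F_t}{\partial y};$$
      calculating the products of the gradients $I_x$ and $I_y$:
      $$I_x^2 = I_x \cdot I_x, \qquad I_y^2 = I_y \cdot I_y, \qquad I_{xy} = I_x \cdot I_y;$$
      performing Gaussian weighting on $I_x^2$, $I_y^2$ and $I_{xy}$ with a Gaussian window $w$ to generate the elements A, B and C of a weighting matrix M:
      $$M = \begin{bmatrix} w \otimes I_x^2 & w \otimes I_{xy} \\ w \otimes I_{xy} & w \otimes I_y^2 \end{bmatrix};$$
      calculating the Harris response of each pixel in the crowd foreground according to the weighting matrix M;
      and performing non-maximum suppression on each pixel according to the Harris response, and selecting the local maximum points as the key points of the crowd foreground.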
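      Further, tracking the key points by a sparse optical flow method to obtain the motion trajectories of the key points specifically includes the following sub-steps:
      acquiring a neighborhood of the key point, and generating the constraint equation of the i-th pixel $I_i$ in the neighborhood as:
      $$\frac{\partial I_i}{\partial x} u + \frac{\partial I_i}{\partial y} v + \frac{\partial I_i}{\partial t} = 0,$$
      wherein $\partial I_i / \partial x$ and $\partial I_i / \partial y$ represent the partial derivatives of the image gray level with respect to space, $\partial I_i / \partial t$ represents the partial derivative of the image gray level with respect to time, and u, v represent the x component and y component of the key point instantaneous velocity;
      solving the objective function
      $$\min_{u, v} \sum_{i} \left( \frac{\partial I_i}{\partial x} u + \frac{\partial I_i}{\partial y} v + \frac{\partial I_i}{\partial t} \right)^2$$
      by least squares to obtain the key point instantaneous velocity $U = (u, v)$;
      and calculating the motion trajectory of the key point according to the instantaneous velocity.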
      Further, the regional population retention detection method further comprises the following steps:
      if the tracked key points are lost, selecting new key points of the current scene image for supplement;
      specifically, the method comprises the following substeps:
      respectively calculating the sum of the distances between the new key point and the tracked key point;
      and selecting the new key point with the maximum sum of the distances for supplement.
      Further, after the retention statistical matrix is updated according to the retention point matrix, the method further includes the following steps:
      and carrying out crowd retention visualization on the scene image according to the updated retention statistical matrix.
      The second purpose of the invention is realized by adopting the following technical scheme:
      a regional crowd retention detection apparatus comprising:
      the first acquisition module is used for acquiring a crowd foreground in the scene image;
      the second acquisition module is used for acquiring key points of the crowd foreground;
      the tracking module is used for tracking the key points to obtain the motion trail of the key points;
      the first calculation module is used for calculating the movement distance of the key point according to the movement track;
      the marking module is used for marking the key points as retention key points and generating a retention point matrix if the movement speed of the key points is smaller than a preset threshold;
      and the updating module is used for updating the retention statistical matrix according to the retention point matrix.
      Further, the regional crowd retention detection device further includes:
      the second calculation module is used for calculating a perspective matrix of the scene image;
      and the filtering module is used for carrying out Gaussian filtering on the stagnation point matrix, and the standard deviation of a filtering kernel of the Gaussian filtering is obtained according to the perspective matrix.
      Further, the regional crowd retention detection device further includes:
      and the supplement module is used for selecting a new key point of the current scene image for supplement if the tracked key point is lost.
      The third purpose of the invention is realized by adopting the following technical scheme:
      an area crowd retention detection apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the aforementioned area crowd retention detection method when executing the computer program.
      Compared with the prior art, the invention has the beneficial effects that: by segmenting the foreground, acquiring key points of the crowd foreground, and using the sparse optical flow method for tracking the key points of the crowd area in the scene to perform motion estimation instead of tracking a single target, the robustness of human tracking is greatly increased, and the influence of crowd adhesion is avoided; then the motion mode of the crowd area represented by the key points in the scene is analyzed, the crowd detention time is estimated through the detention statistical matrix, the detention area and the corresponding detention time can be conveniently calculated, and therefore the crowd detention condition of the whole scene is monitored and analyzed.
    Drawings
      FIG. 1 is a schematic flow chart of a regional crowd retention detection method according to a first embodiment of the present invention;
      FIG. 2 is a schematic structural diagram of an FCN model for crowd foreground segmentation;
      FIG. 3 is a schematic illustration of marking a pedestrian;
      FIG. 4 is a schematic view of crowd retention visualization on a scene image;
      FIG. 5 is a schematic structural diagram of a regional crowd retention detection device according to a second embodiment of the present invention;
      FIG. 6 is a schematic structural diagram of a regional crowd retention detection device according to a third embodiment of the present invention.
    Detailed Description
      The present invention will be further described with reference to the accompanying drawings and the detailed description, and it should be noted that any combination of the embodiments or technical features described below can be used to form a new embodiment without conflict.
    Example one
      FIG. 1 shows a regional crowd retention detection method, which includes the following steps:
      Step S110: acquiring the crowd foreground in the scene image.
      Specifically, a deep-learning-based crowd detection technique can be used, which improves the accuracy of crowd foreground detection and works well in large scenes containing many people.
      In this step, a fully convolutional neural network may be used to segment the crowd foreground in the video frames of the monitored scene. The advantage is that the segmentation of the crowd foreground depends only on the information of the current frame image, so no requirement is imposed on how the camera is installed.
      A fully convolutional network (FCN) is a convolutional neural network that contains no fully connected layer. A general convolutional neural network can be divided into several layers, each learning features of the image at a different level: shallower convolutional layers have smaller receptive fields and learn features of local regions, while deeper convolutional layers have larger receptive fields and learn more abstract features; finally, the features are flattened and classified by fully connected layers. In the FCN architecture, by contrast, the fully connected layers are replaced with convolutions with 1 × 1 kernels, so that the spatial relationship of the features is still preserved in the output layer.
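      To illustrate why a 1 × 1 convolution keeps the spatial layout that a fully connected layer would discard, the following minimal NumPy sketch applies one weight matrix independently at every pixel of a feature map. The shapes, the random weights and the function name `conv1x1` are illustrative assumptions, not values from the patent.

```python
import numpy as np

def conv1x1(features, weights, bias):
    """Apply a 1x1 convolution.

    features: (C_in, H, W) feature map
    weights:  (C_out, C_in), one weight vector per output channel
    bias:     (C_out,)
    returns:  (C_out, H, W); every pixel is scored independently
    """
    out = np.einsum("oc,chw->ohw", weights, features)
    return out + bias[:, None, None]

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 6))   # hypothetical 8-channel 4x6 feature map
weights = rng.normal(size=(2, 8))       # 2 classes: background / crowd (assumed)
bias = np.zeros(2)

scores = conv1x1(features, weights, bias)
mask = scores.argmax(axis=0)            # per-pixel label map with the input's H x W layout
```

      Because the output keeps the H × W grid, the label map can be compared pixel by pixel with the input frame, which is what a foreground segmentation needs.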
      An FCN model for crowd foreground segmentation is shown in FIG. 2. The input of the network is the video frame $F_t$ at time t on which foreground segmentation is required, and the output is the corresponding foreground segmentation result $S_t$, where $S_t$ satisfies:
      $$S_t(i, j) = \begin{cases} 1, & \text{if pixel } (i, j) \text{ of } F_t \text{ belongs to the crowd foreground} \\ 0, & \text{otherwise} \end{cases}$$
      Step S120: obtaining key points of the crowd foreground.
      In this step, the motion of the crowd foreground is analyzed: key points on the crowd foreground are tracked, and the motion of the crowd region near each key point is evaluated from the motion of that key point. In this embodiment, the Harris corner detector is used to detect key points, and the segmented crowd foreground image is used to filter out points in non-crowd regions.
      Further, step S120 of obtaining the key points of the crowd foreground specifically includes the following sub-steps:
      Step S121: calculating the gradients $I_x$ and $I_y$ of the crowd foreground $F_t(x, y)$ in the X and Y directions:
      $$I_x = \frac{\partial F_t}{\partial x}, \qquad I_y = \frac{\partial F_t}{\partial y}.$$
      Step S122: calculating the products of the gradients $I_x$ and $I_y$:
      $$I_x^2 = I_x \cdot I_x, \qquad I_y^2 = I_y \cdot I_y, \qquad I_{xy} = I_x \cdot I_y.$$
      Step S123: performing Gaussian weighting on $I_x^2$, $I_y^2$ and $I_{xy}$ using a Gaussian function $w$ ($\sigma$ may be 1) to generate the elements A, B and C of the weighting matrix M:
      $$M = \begin{bmatrix} w \otimes I_x^2 & w \otimes I_{xy} \\ w \otimes I_{xy} & w \otimes I_y^2 \end{bmatrix}.$$
      Step S124: calculating the Harris response R of each pixel in the crowd foreground according to the weighting matrix M. R may further be set to zero for pixels whose Harris response is less than a threshold t or that lie outside the crowd foreground region, namely:
      $$R = \det M - \alpha (\operatorname{trace} M)^2, \qquad R(i, j) = 0 \ \text{if} \ R(i, j) < t \ \text{or} \ S_t(i, j) = 0.$$
      Step S125: performing non-maximum suppression on each pixel according to the Harris response, and selecting the local maximum points as the key points of the crowd foreground.
      Non-maximum suppression can be performed in the 5 × 5 neighborhood of each pixel; the local maximum points are the corner points in the image, and these corner points are the key points $C_1, C_2, \ldots, C_n$.
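      Sub-steps S121 to S125 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions rather than the patented implementation: central differences stand in for the gradient operator, and the values `alpha=0.04` and `t=1e-4`, the helper names and the synthetic test frame are all hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma=1.0, radius=2):
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def smooth(img, kernel):
    """Separable Gaussian weighting with edge padding."""
    r = len(kernel) // 2
    padded = np.pad(img, r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def harris_keypoints(frame, fg_mask, alpha=0.04, t=1e-4, nms=5):
    iy, ix = np.gradient(frame.astype(float))    # S121: gradients along Y (rows) and X (columns)
    w = gaussian_kernel(sigma=1.0)
    sxx = smooth(ix * ix, w)                     # S122 + S123: Gaussian-weighted products
    syy = smooth(iy * iy, w)
    sxy = smooth(ix * iy, w)
    resp = (sxx * syy - sxy ** 2) - alpha * (sxx + syy) ** 2   # S124: det(M) - alpha * trace(M)^2
    resp[(resp < t) | (fg_mask == 0)] = 0.0      # drop weak responses and non-crowd pixels
    pad = nms // 2                               # S125: non-maximum suppression in nms x nms windows
    windows = np.lib.stride_tricks.sliding_window_view(
        np.pad(resp, pad, mode="constant"), (nms, nms))
    local_max = windows.max(axis=(2, 3))
    ys, xs = np.nonzero((resp > 0) & (resp == local_max))
    return np.stack([xs, ys], axis=1)            # key points as (x, y)

# Synthetic frame: a bright square whose four corners should be detected.
frame = np.zeros((40, 40))
frame[10:30, 10:30] = 1.0
keypoints = harris_keypoints(frame, fg_mask=np.ones((40, 40)))
```

      Passing the segmentation result as `fg_mask` is what restricts the corners to crowd regions; an all-zero mask yields no key points.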
      Step S130: tracking the key points to obtain the motion trajectories of the key points.
      Further, the key points are tracked by a sparse optical flow method.
      The optical flow represents the instantaneous velocity of pixel motion between the images of a sequence. In the invention, the optical flow does not need to be calculated for all pixels in the image but only for the key points of the crowd foreground, which greatly speeds up the optical flow calculation.
      Further, step S130 of tracking the key points by a sparse optical flow method to obtain the motion trajectories of the key points specifically includes the following sub-steps:
      Step S131: acquiring a neighborhood of the key point.
      For any key point (x, y), an n × n neighborhood Ω is established around it, and the optical flow $U = (u, v)$ is assumed to remain constant within this neighborhood. A constraint equation can be written for each pixel in the neighborhood Ω.
      Step S132: generating the constraint equation of the i-th pixel $I_i$ in the neighborhood as:
      $$\frac{\partial I_i}{\partial x} u + \frac{\partial I_i}{\partial y} v + \frac{\partial I_i}{\partial t} = 0,$$
      wherein $\partial I_i / \partial x$ and $\partial I_i / \partial y$ represent the partial derivatives of the image gray level with respect to space; $\partial I_i / \partial t$ represents the partial derivative of the image gray level with respect to time; and u, v represent the x component and y component of the key point instantaneous velocity.
      Step S133: solving the objective function
      $$\min_{u, v} \sum_{i \in \Omega} \left( \frac{\partial I_i}{\partial x} u + \frac{\partial I_i}{\partial y} v + \frac{\partial I_i}{\partial t} \right)^2$$
      by the least squares method to obtain the key point instantaneous velocity $U = (u, v)$.
      Step S134: calculating the motion trajectory of the key point according to the instantaneous velocity.
      Once the instantaneous velocity of the key point is known, the key point can be tracked, its motion trajectory calculated, and its position in each frame obtained.
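      The least-squares solve of steps S131 to S134 can be sketched for a single key point as follows. The sketch assumes a single-scale solve with central-difference gradients and a plain frame difference for the temporal derivative; the function name `lk_flow`, the window size and the synthetic frames are hypothetical.

```python
import numpy as np

def lk_flow(prev, curr, x, y, n=7):
    """Estimate the optical flow (u, v) at key point (x, y) from an n x n neighbourhood."""
    prev = prev.astype(float)
    curr = curr.astype(float)
    r = n // 2
    iy, ix = np.gradient(prev)            # spatial derivatives of the gray level
    it = curr - prev                      # temporal derivative (frame difference)
    win = (slice(y - r, y + r + 1), slice(x - r, x + r + 1))
    a = np.stack([ix[win].ravel(), iy[win].ravel()], axis=1)  # one constraint row per pixel
    b = -it[win].ravel()
    flow, *_ = np.linalg.lstsq(a, b, rcond=None)              # least-squares solution of a @ U = b
    return flow                           # array([u, v])

# Synthetic check: a smooth blob shifted by (0.3, 0.2) pixels between two frames.
yy, xx = np.mgrid[0:41, 0:41].astype(float)

def blob(dx, dy):
    return np.exp(-((xx - 20 - dx) ** 2 + (yy - 20 - dy) ** 2) / (2 * 6.0 ** 2))

prev_frame, curr_frame = blob(0.0, 0.0), blob(0.3, 0.2)
u, v = lk_flow(prev_frame, curr_frame, x=23, y=22)
next_pos = (23 + u, 22 + v)               # S134: position of the key point in the next frame
```

      Appending `next_pos` to the key point's history frame after frame yields its motion trajectory. A single-scale solve like this is only reliable for small displacements.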
      When tracking key points on the crowd foreground, N (e.g., 1000) key points, denoted $\{P_1, P_2, P_3, \ldots, P_N\}$, are first selected at time $t_0$. At the subsequent times $t_1, t_2, \ldots, t_n$ these key points are tracked continuously, and every point records its past coordinate trajectory. Here $T_i = \{p_0, p_1, \ldots, p_n\}$ represents the history of the i-th key point from time $t_0$ to time $t_n$, where $p_t$ represents the coordinate position of the key point in the image at time t.
      Step S140: calculating the movement distance of the key point according to the motion trajectory.
      For all key points $\{P_1, P_2, P_3, \ldots, P_N\}$ on the crowd foreground that are in the tracking state, the motion state of each key point is analyzed in detail. Since the key points lie in the crowd region, their motion state reflects, to a certain extent, the motion state of the crowd in the nearby region Ω.
      For the i-th key point $P_i$, the motion trajectory from time $t_0$ to time $t_n$ is $T_i = \{p_0, p_1, \ldots, p_t, \ldots, p_n\}$, where $p_t$ represents the position of key point $P_i$ in the image at time t. The movement distance $d_i$ of key point $P_i$ over a period of time, e.g. 1 second, is calculated as:
      $$d_i = \frac{1}{r} \sum_{t=K+1}^{n} \operatorname{Dist}(p_t, p_{t-1}),$$
      where r is the frame rate of the video (frames per second), K is the sequence number of the frame one second before time $t_n$, and $\operatorname{Dist}(p_t, p_{t-1})$ represents the Euclidean distance between the two points $p_t$ and $p_{t-1}$ in the image plane. The movement distance $d_i$ is in effect the average distance moved by $P_i$ over the past second.
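      As a minimal sketch of this calculation, the function below averages the frame-to-frame Euclidean displacements of one key point over the last second of its trajectory. The function name and the choice to average over however many steps are available when the track is shorter than one second are assumptions.

```python
import numpy as np

def mean_step_distance(track, fps):
    """Average per-frame displacement of one key point over the last second.

    track: sequence of (x, y) positions, one per frame, oldest first
    fps:   frame rate r of the video
    """
    pts = np.asarray(track, dtype=float)[-(fps + 1):]       # frames K .. n
    if len(pts) < 2:
        return 0.0
    steps = np.linalg.norm(np.diff(pts, axis=0), axis=1)    # Dist(p_t, p_{t-1})
    return float(steps.mean())

moving = [(2.0 * t, 0.0) for t in range(40)]    # moves 2 px per frame
still = [(5.0, 7.0)] * 40                       # does not move
d_moving = mean_step_distance(moving, fps=25)
d_still = mean_step_distance(still, fps=25)
```

      Comparing these values with a threshold is what separates moving key points from stationary ones.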
      Step S150: if the movement speed of the key point is less than a preset threshold, marking the key point as a stagnation key point and generating a stagnation point matrix.
      The movement distance of the key point is used to determine whether key point $P_i$ has been in the retention state within the past second: if the movement distance is less than γ, where γ is a preset threshold, the key point has been stationary within the past second, that is, it is a retention key point.
      A stagnation point matrix can be used to represent the key points that are in the retention state in the image at time $t_i$.
      Further, the preset threshold may be calculated from the perspective matrix PMap of the scene image. Introducing the perspective relation of the scene image restores the real situation of the crowd movement, so the retention judgment is more accurate.
      Therefore, the regional crowd retention detection method further includes the following step:
      Step S101: calculating a perspective matrix of the scene image.
      The method needs to estimate the perspective relation of the scene captured by the camera, so as to relate the distance a person moves in the image to the real distance in the scene.
      First, a perspective matrix PMap is initialized, whose size is the same as that of the video picture acquired by the camera; PMap(x, y) represents the real-scene height represented by 1 pixel at image position (x, y). The estimation of the scene perspective relation in the invention is an approximate calculation: only the proportional relation along the vertical direction of the image is considered, and all people in the camera picture are assumed to be 1.7 m tall.
      One frame is selected from the captured video; the picture must contain at least 2 pedestrians at different distances, and the more the better. As shown in FIG. 3, the pedestrians are marked and the vertical y-coordinates of their heads and feet are recorded for each of the K marked people, where K is the number of all people marked.
      This gives the height in pixels occupied by a 1.7 m tall person at different y coordinates, from far to near, and the mapping between the image y coordinate and the person's height h in the image is expressed by the linear function $h = w \cdot y + b$, where w and b are the parameters to be solved.
      The parameters w and b are fitted from the marked pedestrians.
      Finally, the values of the elements in PMap can be solved according to the linear mapping function:
      PMap(i,j)=w·j+b。
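      A possible way to fit w and b and fill PMap is sketched below. It is an assumption-laden illustration: the foot coordinate is taken as each pedestrian's reference y, the fit is an ordinary least-squares line via `np.polyfit`, j is treated as the image row, and the sample annotations are invented for the example.

```python
import numpy as np

def fit_perspective(heads_y, feet_y, height, width):
    """Fit h = w * y + b from marked pedestrians and fill a (height, width) PMap."""
    heads_y = np.asarray(heads_y, dtype=float)
    feet_y = np.asarray(feet_y, dtype=float)
    h = feet_y - heads_y                        # pixel height of each marked pedestrian
    w, b = np.polyfit(feet_y, h, deg=1)         # least-squares line (assumed fitting method)
    rows = np.arange(height, dtype=float)
    pmap = np.repeat((w * rows + b)[:, None], width, axis=1)   # same value along each row
    return w, b, pmap

# Hypothetical annotations for three pedestrians (y grows downwards).
feet = [100.0, 200.0, 300.0]
heads = [40.0, 90.0, 140.0]                     # pixel heights 60, 110, 160
w, b, pmap = fit_perspective(heads, feet, height=480, width=640)
```

      With these sample values the fitted line is h = 0.5·y + 10, so a person standing at row 200 is expected to be about 110 pixels tall.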
      Further, after step S150 marks the key point as a stagnation key point and generates the stagnation point matrix, the method further includes the following step:
      Step S102: performing Gaussian filtering on the stagnation point matrix, where the standard deviation of the Gaussian filter kernel is obtained from the perspective matrix.
      Gaussian filtering is performed on the stagnation point matrix at time $t_i$. The standard deviation σ of the filter kernel is related to the perspective relation at each position of the image, that is, to the value of the corresponding element in PMap. With σ(p) denoting the standard deviation of the filter kernel at position p in the image, σ(p) is determined from PMap(p), and the Gaussian-filtered stagnation point matrix is obtained by filtering each position with its own kernel.
      The values in the stagnation point matrix indicate whether the crowd at the corresponding position in the picture is in the retention state. A value of 0 indicates that there is no crowd in the area or that the crowd is in motion.
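      The marking and filtering steps can be sketched as follows. The binary encoding of the stagnation point matrix, the relation `sigma = scale * PMap`, the unnormalized Gaussian bumps and all names are assumptions made for illustration; the patent's own formulas for σ(p) and the filtered matrix are not reproduced here.

```python
import numpy as np

def stagnation_matrix(points, dists, gamma, shape):
    """Binary matrix with 1 at key points whose movement distance is below gamma."""
    m = np.zeros(shape, dtype=float)
    for (x, y), d in zip(points, dists):
        if d < gamma:
            m[int(round(y)), int(round(x))] = 1.0
    return m

def perspective_blur(m, pmap, scale=0.25):
    """Spread each stagnation point with a Gaussian whose sigma follows PMap.

    Assumes sigma(p) = scale * PMap(p); nearer (larger) people get wider kernels.
    """
    h, w = m.shape
    yy, xx = np.mgrid[0:h, 0:w]
    out = np.zeros_like(m)
    for y, x in zip(*np.nonzero(m)):
        sigma = max(scale * pmap[y, x], 0.5)
        out += m[y, x] * np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2.0 * sigma ** 2))
    return np.clip(out, 0.0, 1.0)

points = [(5, 8), (20, 30)]          # (x, y) key point positions
dists = [0.1, 3.0]                   # movement distances; only the first is below gamma
m = stagnation_matrix(points, dists, gamma=0.5, shape=(40, 40))
blurred = perspective_blur(m, pmap=np.full((40, 40), 8.0))
```

      Looping over the stagnation points keeps the position-dependent kernel explicit; it is adequate for a few thousand key points but is not meant as an optimized filter.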
      Step S160: updating the retention statistical matrix according to the stagnation point matrix.
      At time $t_0$, a retention statistical matrix S is initialized. Its size is the same as that of the image, and the element S(i, j) represents how long, in seconds, the crowd has stayed at position (i, j) in the image. If no crowd stays at the position, S(i, j) is 0.
      At each subsequent time, the stagnation point matrix, or the Gaussian-filtered stagnation point matrix, is superimposed once on the retention statistical matrix S. Taking the Gaussian-filtered stagnation point matrix as an example, the retention statistical matrix is updated with a decay factor λ.
      The higher λ is, the more sensitive the statistics are to crowd motion and the less readily retention accumulates. λ may be set empirically.
      Through the above steps, for each video frame the influence of the key point motion on the retention statistical matrix S at the current moment is analyzed, and S is then updated. Once S is available, the retention state of the crowd at any position in the image can easily be analyzed.
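      One update rule with the qualitative behaviour described above is sketched below. The rule itself is an assumption chosen for illustration, not the patent's formula: where the filtered stagnation value g is 1 the retention time grows by the frame interval, and where g is 0 the accumulated time decays by the factor λ.

```python
import numpy as np

def update_retention(s, g, dt, lam=0.1):
    """Assumed update: S <- (1 - lam * (1 - g)) * S + g * dt.

    s:   retention statistical matrix (seconds)
    g:   stagnation point matrix, optionally Gaussian-filtered, values in [0, 1]
    dt:  time between updates in seconds
    lam: decay factor; larger values forget retention faster once people move
    """
    g = np.clip(g, 0.0, 1.0)
    return (1.0 - lam * (1.0 - g)) * s + g * dt

s = np.zeros((2, 2))
g = np.array([[1.0, 0.0], [0.0, 0.0]])
for _ in range(5):                       # five one-second updates with retention at (0, 0)
    s = update_retention(s, g, dt=1.0)
s_after_motion = update_retention(s, np.zeros((2, 2)), dt=1.0, lam=0.5)
```

      After the five updates the retained cell holds 5 seconds, and a single update without retention at λ = 0.5 halves it to 2.5 seconds.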
      According to the method, the key points of the crowd foreground are obtained through foreground segmentation, and the sparse optical flow method is used for tracking the key points of the crowd area in the scene to perform motion estimation instead of tracking a single target, so that the robustness of human body tracking is greatly improved, and the influence of crowd adhesion is avoided; then the motion mode of the crowd area represented by the key points in the scene is analyzed, the crowd detention time is estimated through the detention statistical matrix, the detention area and the corresponding detention time can be conveniently calculated, and therefore the crowd detention condition of the whole scene is monitored and analyzed.
      As a further improvement of the present invention, after step S160 updates the retention statistical matrix according to the stagnation point matrix, the method further includes the following step:
      Step S170: performing crowd retention visualization on the scene image according to the updated retention statistical matrix. As shown in FIG. 4, the retention statistical matrix is visualized and superimposed on the video image, which makes it easy to compare the retention of the crowd in different areas of the image.
      In this embodiment, the retention statistical matrix is mapped to a flame color scale that changes gradually from blue to red, where blue represents no crowd retention and red represents a longer retention time.
      First, an index table L is established that maps each of the 256 values from 0 to 255 to an RGB24 color; any value G in the statistical matrix is then mapped to a three-channel color value through L.
      Applying this mapping to all elements of the retention statistical matrix S yields a color heat map H.
      The heat map H is then superimposed on the original image I to generate a retention rendering R, as shown in FIG. 4, which shows more intuitively the degree of crowd retention in different areas of the original video frame. The superposition uses the following formula:
      R=I+(H-B)
      where B is a color image of the same size as the video image with a blue background, used to make the regions without crowd retention transparent.
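      The color mapping and the superposition R = I + (H − B) can be sketched as follows. The linear blue-to-red index table, the scaling of retention time by `max_seconds`, the RGB channel order and the saturation to [0, 255] are assumptions made for the example.

```python
import numpy as np

def build_lut():
    """Index table L: 256 RGB colors fading linearly from blue (0) to red (255)."""
    idx = np.arange(256)
    lut = np.zeros((256, 3), dtype=np.int16)
    lut[:, 0] = idx            # red grows with retention
    lut[:, 2] = 255 - idx      # blue fades
    return lut

def overlay(image, s, max_seconds=60.0):
    """Return R = I + (H - B), saturated to the 0..255 range."""
    lut = build_lut()
    g = np.clip(s / max_seconds * 255.0, 0, 255).astype(np.uint8)
    heat = lut[g]                                   # H: (rows, cols, 3)
    blue = np.zeros_like(heat)
    blue[..., 2] = 255                              # B: pure blue background
    out = image.astype(np.int16) + (heat - blue)    # zero retention leaves the pixel unchanged
    return np.clip(out, 0, 255).astype(np.uint8)

image = np.full((4, 4, 3), 100, dtype=np.uint8)     # uniform gray test frame (RGB order assumed)
s = np.zeros((4, 4))
s[1, 1] = 60.0                                      # one cell with long retention
rendered = overlay(image, s)
```

      Subtracting B is what makes areas without retention transparent: there H equals B, so the original pixel passes through untouched.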
      As a further improvement of the invention, the regional population retention detection method further comprises the following steps:
      Step S180: if tracked key points are lost, selecting new key points of the current scene image for supplementation.
      As tracking proceeds, people in the scene may leave the scene or become occluded, so some key points may be lost. After tracking for a period of time, the number of lost key points grows, and key points on people newly entering the scene are not within the tracking range. Therefore, new key points of the current scene image need to be selected for supplementation, with one key point being selected from the key points detected in the current frame.
      Further, step S180 of selecting new key points of the current scene image for supplementation when tracked key points are lost includes the following sub-steps:
      Step S181: for each new key point, calculating the sum of its distances to the tracked key points;
      Step S182: selecting the new key point with the largest sum of distances for supplementation.
      An example of a selection strategy is as follows:
      1. The newly detected key points are denoted $\{Q_1, Q_2, Q_3, \ldots, Q_N\}$. The distance between each new key point and each key point currently in the tracking state is calculated, giving a distance matrix $D_{N \times N}$ with $D(i, j) = \operatorname{Dist}(P_i, Q_j)$, where $\operatorname{Dist}(p_1, p_2)$ represents the Euclidean distance between two points $p_1$, $p_2$ in the image plane.
      2. The elements of each column of matrix D are added to obtain a row vector $W_{1 \times N}$ of length N.
      3. The position index of the maximum value in vector W is found: $k = \arg\max W$.
      $Q_k$ is the new key point selected.
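      The selection strategy above maps directly onto a few NumPy operations. The sketch below uses invented sample coordinates and a hypothetical function name; it follows the three numbered steps.

```python
import numpy as np

def pick_new_keypoint(tracked, candidates):
    """Return the index and coordinates of the candidate farthest, in summed distance, from the tracked points."""
    tracked = np.asarray(tracked, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    # D[i, j] = Dist(P_i, Q_j)
    d = np.linalg.norm(tracked[:, None, :] - candidates[None, :, :], axis=2)
    w = d.sum(axis=0)            # column sums: one total distance per candidate
    k = int(np.argmax(w))
    return k, candidates[k]

tracked = [(0, 0), (1, 0), (0, 1)]          # key points still being tracked
candidates = [(1, 1), (10, 10), (2, 2)]     # key points newly detected in the current frame
k, q_k = pick_new_keypoint(tracked, candidates)
```

      Choosing the candidate with the largest summed distance favours regions that currently have no tracked points, such as people who have just entered the scene.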
      Through step S180, N key points on the crowd foreground are tracked and analyzed at all times, which guarantees a good crowd retention detection effect.
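The selection strategy of step S180 can be sketched in a few lines of Python. This is a minimal illustration only, not the patented implementation: the function name `select_supplement` and the representation of key points as `(x, y)` tuples are assumptions made for the example. For each candidate it computes the column sum W of the distance matrix D and returns the index of the largest entry.

```python
import math


def select_supplement(tracked, candidates):
    """Return the index k of the candidate Q_k with the largest summed distance.

    tracked    -- list of (x, y) key points currently being tracked (the P_i)
    candidates -- list of (x, y) newly detected key points (the Q_j)
    Both names are hypothetical; the source only describes the steps.
    """
    if not tracked or not candidates:
        raise ValueError("need at least one tracked and one candidate key point")
    # W[j] is the sum over i of D(i, j) = Dist(P_i, Q_j), i.e. one column sum of D.
    W = [sum(math.dist(p, q) for p in tracked) for q in candidates]
    # k = argmax W; ties resolve to the first maximal index.
    return max(range(len(W)), key=W.__getitem__)


tracked_points = [(0.0, 0.0), (1.0, 0.0)]
new_points = [(0.0, 1.0), (10.0, 10.0), (2.0, 0.0)]
k = select_supplement(tracked_points, new_points)
chosen = new_points[k]
```

Choosing the candidate farthest from the existing key points tends to spread the tracked points over the scene, which fits the stated goal of covering people who have newly entered it.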
    Example two
      The regional crowd retention detection apparatus shown in fig. 5 includes:
      a first obtaining module 110, configured to obtain a crowd foreground in the scene image;
      a second obtaining module 120, configured to obtain a key point of the crowd foreground;
      a tracking module 130, configured to track the key points and obtain motion trajectories of the key points;
      a first calculating module 140, configured to calculate a movement distance of the key point according to the movement trajectory;
      the marking module 150 is configured to mark the key point as a stagnation key point and generate a stagnation point matrix if the movement speed of the key point is smaller than a preset threshold;
      and an updating module 160, configured to update the retention statistics matrix according to the retention point matrix.
      Further, the regional crowd retention detection device also includes:
      a second calculation module 101, configured to calculate a perspective matrix of the scene image;
      and the filtering module 102 is configured to perform gaussian filtering on the stagnation point matrix, and a standard deviation of a filtering kernel of the gaussian filtering is obtained according to the perspective matrix.
      Further, the regional crowd retention detection device also includes:
      and the visualization module 170 is configured to perform crowd retention visualization on the scene image according to the updated retention statistical matrix.
      Further, the regional crowd retention detection device also includes:
      and the supplementing module 180 is configured to select a new key point of the current scene image for supplementing if the tracked key point is lost.
      Specifically, the supplement module 180 includes:
      a calculating unit 181, configured to calculate a sum of distances between the new key point and the tracked key points, respectively;
      and the selecting unit 182 is configured to select a new key point with the largest sum of the distances to supplement.
      The apparatus in this embodiment and the method in the foregoing embodiments are based on two aspects of the same inventive concept, and the method implementation process has been described in detail in the foregoing, so that those skilled in the art can clearly understand the structure and implementation process of the system in this embodiment according to the foregoing description, and for the sake of brevity of the description, details are not repeated here.
      For convenience of description, the above device is described as being divided into various modules by function, and the modules are described separately. Of course, when implementing the invention, the functions of the modules may be implemented in one or more pieces of software and/or hardware.
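As an illustration of how the marking module 150 and the updating module 160 might look in software, the following Python sketch marks slow key points in a stagnation point matrix and folds that matrix into a retention statistics matrix. All names are hypothetical. In particular, the update rule shown here (plain accumulation of the frame interval at stagnation points) is an assumption made for the example; the text above only states that the retention statistics matrix is updated according to the stagnation point matrix.

```python
def stagnation_point_matrix(points, speeds, threshold, height, width):
    """Build a matrix with 1.0 at every key point whose speed is below threshold.

    points -- list of integer (x, y) pixel positions of the key points
    speeds -- movement speed of each key point, in the same order
    """
    matrix = [[0.0] * width for _ in range(height)]
    for (x, y), speed in zip(points, speeds):
        if speed < threshold:
            matrix[y][x] = 1.0  # rows index y, columns index x
    return matrix


def update_retention(retention, stagnation, frame_interval=1.0):
    """ASSUMED update rule: add the frame interval wherever a stagnation point is marked."""
    return [
        [r + frame_interval * s for r, s in zip(r_row, s_row)]
        for r_row, s_row in zip(retention, stagnation)
    ]


points = [(1, 0), (2, 1)]
speeds = [0.2, 3.0]          # only the first key point is slower than the threshold
S = stagnation_point_matrix(points, speeds, threshold=1.0, height=2, width=3)
R = [[0.0] * 3 for _ in range(2)]
R = update_retention(R, S)
R = update_retention(R, S)   # the same point stays still for a second frame
```

Under this assumed rule, each entry of the retention statistics matrix approximates how long a stagnation point has been observed at that pixel.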
      From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general hardware platform. With such an understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments or some parts of the embodiments, such as:
      a storage medium storing a computer program which, when executed by a processor, implements the steps of the aforementioned regional crowd retention detection method.
      The described embodiments of the apparatus are merely illustrative, wherein the modules or units described as separate parts may or may not be physically separate, and the parts illustrated as modules or units may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
      The invention is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like, as in embodiment four.
    Example three
      The device for detecting the retention of people in the area as shown in fig. 6 comprises a memory 200, a processor 300 and a computer program stored in the memory 200 and executable on the processor 300, wherein the processor 300 implements the steps of the method for detecting the retention of people in the area when executing the computer program.
      The apparatus in this embodiment and the method in the foregoing embodiments are based on two aspects of the same inventive concept, and the method implementation process has been described in detail in the foregoing, so that those skilled in the art can clearly understand the structure and implementation process of the system in this embodiment according to the foregoing description, and for the sake of brevity of the description, details are not repeated here.
      The device for detecting the retention of the crowd in the area, provided by the embodiment of the invention, can be used for obtaining key points of the crowd foreground through foreground segmentation, and performing motion estimation by using a sparse optical flow method for tracking the key points of the crowd area in a scene instead of tracking a single target, so that the robustness of human body tracking is greatly increased, and the device is not influenced by crowd adhesion; then the motion mode of the crowd area represented by the key points in the scene is analyzed, the crowd detention time is estimated through the detention statistical matrix, the detention area and the corresponding detention time can be conveniently calculated, and therefore the crowd detention condition of the whole scene is monitored and analyzed.
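The sparse optical flow estimation mentioned above reduces, for each key point, to a small least-squares problem: every pixel i in the key point's neighborhood contributes one constraint $I_{x_i} u + I_{y_i} v + I_{t_i} = 0$, and the velocity (u, v) is the value that minimizes the sum of squared residuals. The sketch below solves the resulting 2x2 normal equations directly. It is a minimal illustration under assumptions: the function name, the input format (a list of per-pixel gradient triples) and the singularity tolerance are all hypothetical and not taken from the source.

```python
def estimate_velocity(gradients, eps=1e-12):
    """Least-squares solution of Ix*u + Iy*v + It = 0 over a neighborhood.

    gradients -- iterable of (Ix, Iy, It) triples, one per neighborhood pixel
    Returns (u, v), or None when the system is singular (e.g. all gradients
    point in the same direction, so the motion cannot be determined).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for ix, iy, it in gradients:
        sxx += ix * ix
        sxy += ix * iy
        syy += iy * iy
        sxt += ix * it
        syt += iy * it
    # Normal equations: [[sxx, sxy], [sxy, syy]] @ [u, v] = [-sxt, -syt]
    det = sxx * syy - sxy * sxy
    if abs(det) < eps:
        return None
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Gradients constructed so that the true motion is u = 2, v = -1.
samples = [(1.0, 0.0, -2.0), (0.0, 1.0, 1.0), (1.0, 1.0, -1.0), (2.0, -1.0, -5.0)]
velocity = estimate_velocity(samples)
```

Accumulating the estimated velocity frame by frame gives the position of the key point over time, that is, its motion trajectory.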
      The above embodiments are only preferred embodiments of the present invention, and the protection scope of the present invention is not limited thereby, and any insubstantial changes and substitutions made by those skilled in the art based on the present invention are within the protection scope of the present invention.
    Claims (10)
1. A regional population retention detection method is characterized by comprising the following steps:
      acquiring a crowd foreground in the scene image;
      obtaining key points of the crowd foreground;
      tracking the key points to obtain the motion trail of the key points;
      calculating the movement distance of the key point according to the movement track;
      if the movement speed of the key point is smaller than a preset threshold value, marking the key point as a stagnation key point and generating a stagnation point matrix;
      and updating the retention statistical matrix according to the retention point matrix.
    2. The method of regional population retention detection according to claim 1, further comprising the steps of: calculating a perspective matrix of the scene image;
      the preset threshold is specifically calculated according to the perspective matrix;
      after the step of marking the key point as a stagnation key point and generating a stagnation point matrix if the movement speed of the key point is less than the preset threshold, the method further comprises the following step:
      and performing Gaussian filtering on the stagnation point matrix, and obtaining the standard deviation of a filtering kernel of the Gaussian filtering according to the perspective matrix.
    3. The method for detecting crowd retention in a region according to claim 1, wherein the step of obtaining the key points of the crowd foreground comprises the following sub-steps:
      calculating the gradients $I_x$ and $I_y$ of the crowd foreground in the X and Y directions:
      $$I_x = \frac{\partial F}{\partial x} = F \otimes \begin{pmatrix} -1 & 0 & 1 \end{pmatrix}$$
      $$I_y = \frac{\partial F}{\partial y} = F \otimes \begin{pmatrix} -1 & 0 & 1 \end{pmatrix}^{T};$$
      calculating the products of the gradients $I_x$ and $I_y$:
      $$I_x^2 = I_x \cdot I_x$$
      $$I_y^2 = I_y \cdot I_y$$
      $$I_{xy} = I_x \cdot I_y;$$
      performing Gaussian weighting on $I_x^2$, $I_y^2$ and $I_{xy}$ to generate the elements A, B and C of the weighting matrix M:
      $$A = g(I_x^2) = I_x^2 \otimes w$$
      $$B = g(I_y^2) = I_y^2 \otimes w$$
      $$C = g(I_{xy}) = I_{xy} \otimes w;$$
      calculating Harris response of each pixel in the foreground of the crowd according to the weighting matrix M;
      and performing non-maximum suppression on each pixel point according to the Harris response, and selecting the local maximum points as the key points of the crowd foreground.
    4. The method for detecting region crowd retention according to claim 1, wherein the tracking the key points by a sparse optical flow method to obtain the motion trajectories of the key points comprises the following sub-steps:
      acquiring a neighborhood of the key point, and generating the following constraint equation for the i-th pixel $I_i$ in the neighborhood:
      $$I_{x_i} \cdot u + I_{y_i} \cdot v + I_{t_i} = 0$$
      wherein u and v represent the x component and the y component of the instantaneous velocity of the key point;
      solving an objective function by least squares to obtain the instantaneous velocity $U = (u, v)$ of the key point;
      and calculating the motion trail of the key point according to the instantaneous speed.
    5. The method of regional population retention detection according to any one of claims 4, further comprising the steps of:
      if the tracked key points are lost, selecting new key points of the current scene image for supplement;
      specifically, the method comprises the following substeps:
      respectively calculating the sum of the distances between the new key point and the tracked key point;
      and selecting the new key point with the maximum sum of the distances for supplement.
    6. The method of any one of claims 1-5, wherein after updating the retention statistics matrix according to the retention point matrix, the method further comprises the steps of:
      and carrying out crowd retention visualization on the scene image according to the updated retention statistical matrix.
    7. A regional crowd retention detection device, comprising:
      the first acquisition module is used for acquiring a crowd foreground in the scene image;
      the second acquisition module is used for acquiring key points of the crowd foreground;
      the tracking module is used for tracking the key points to obtain the motion trail of the key points;
      the first calculation module is used for calculating the movement distance of the key point according to the movement track;
      the marking module is used for marking the key points as retention key points and generating a retention point matrix if the movement speed of the key points is smaller than a preset threshold;
      and the updating module is used for updating the retention statistical matrix according to the retention point matrix.
    8. The regional crowd retention detection device of claim 7, further comprising:
      the second calculation module is used for calculating a perspective matrix of the scene image;
      and the filtering module is used for carrying out Gaussian filtering on the stagnation point matrix, and the standard deviation of a filtering kernel of the Gaussian filtering is obtained according to the perspective matrix.
    9. The regional crowd retention detection device of claim 7 or 8, further comprising:
      and the supplement module is used for selecting a new key point of the current scene image for supplement if the tracked key point is lost.
    10. A storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of the regional crowd retention detection method according to any one of claims 1 to 6.
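For readers who want to experiment with the key point extraction recited in claim 3, the following pure-Python sketch follows the same sequence: central-difference gradients from the (-1 0 1) kernel, gradient products, Gaussian weighting into A, B and C, a Harris response, and non-maximum suppression. Several details are assumptions made for the example and are not stated in the claims: the response formula $R = AB - C^2 - k(A + B)^2$ (the common Harris form), the value k = 0.04, the 3x3 Gaussian window, the replicated image border, the response threshold, and the function name `harris_keypoints`.

```python
def harris_keypoints(F, k=0.04, threshold=0.1):
    """Return (x, y) key points of the foreground image F (list of rows of floats)."""
    h, w = len(F), len(F[0])

    def px(img, y, x):
        # Replicate the border: out-of-range reads return the nearest valid pixel.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    # Gradients with the (-1 0 1) kernel and its transpose. The sign convention
    # (correlation rather than flipped convolution) does not change the products.
    Ix = [[px(F, y, x + 1) - px(F, y, x - 1) for x in range(w)] for y in range(h)]
    Iy = [[px(F, y + 1, x) - px(F, y - 1, x) for x in range(w)] for y in range(h)]

    # Per-pixel gradient products.
    Ixx = [[Ix[y][x] * Ix[y][x] for x in range(w)] for y in range(h)]
    Iyy = [[Iy[y][x] * Iy[y][x] for x in range(w)] for y in range(h)]
    Ixy = [[Ix[y][x] * Iy[y][x] for x in range(w)] for y in range(h)]

    gauss = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # assumed 3x3 window w, sums to 16

    def smooth(img):
        return [[sum(gauss[dy + 1][dx + 1] * px(img, y + dy, x + dx)
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 16.0
                 for x in range(w)] for y in range(h)]

    A, B, C = smooth(Ixx), smooth(Iyy), smooth(Ixy)

    # Assumed Harris response: det(M) - k * trace(M)^2 with M = [[A, C], [C, B]].
    R = [[A[y][x] * B[y][x] - C[y][x] ** 2 - k * (A[y][x] + B[y][x]) ** 2
          for x in range(w)] for y in range(h)]

    # Non-maximum suppression over the 8-neighborhood.
    keypoints = []
    for y in range(h):
        for x in range(w):
            r = R[y][x]
            if r <= threshold:
                continue
            neighbours = [R[y + dy][x + dx]
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0)
                          and 0 <= y + dy < h and 0 <= x + dx < w]
            if all(r > n for n in neighbours):
                keypoints.append((x, y))
    return keypoints


# An 8x8 foreground whose lower-right quadrant is filled: one corner at (4, 4).
foreground = [[1.0 if (x >= 4 and y >= 4) else 0.0 for x in range(8)] for y in range(8)]
corners = harris_keypoints(foreground)
```

Along a straight edge only one of A and B is large, so the response is negative; only where both gradients are strong, as at the corner of the filled quadrant, does the response become a positive local maximum.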
    Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN201710482241.8A CN107481260A (en) | 2017-06-22 | 2017-06-22 | A kind of region crowd is detained detection method, device and storage medium | 
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN201710482241.8A CN107481260A (en) | 2017-06-22 | 2017-06-22 | A kind of region crowd is detained detection method, device and storage medium | 
Publications (1)
| Publication Number | Publication Date | 
|---|---|
| CN107481260A (en) | 2017-12-15 | 
Family
ID=60594803
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| CN201710482241.8A Pending CN107481260A (en) | 2017-06-22 | 2017-06-22 | A kind of region crowd is detained detection method, device and storage medium | 
Country Status (1)
| Country | Link | 
|---|---|
| CN (1) | CN107481260A (en) | 
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN101325690A (en) * | 2007-06-12 | 2008-12-17 | 上海正电科技发展有限公司 | Method and system for detecting human flow analysis and crowd accumulation process of monitoring video flow | 
| CN102156880A (en) * | 2011-04-11 | 2011-08-17 | 上海交通大学 | Method for detecting abnormal crowd behavior based on improved social force model | 
| CN102799863A (en) * | 2012-07-02 | 2012-11-28 | 中国计量学院 | Method for detecting group crowd abnormal behaviors in video monitoring | 
| CN102867311A (en) * | 2011-07-07 | 2013-01-09 | 株式会社理光 | Target tracking method and target tracking device | 
| CN102968802A (en) * | 2012-11-28 | 2013-03-13 | 无锡港湾网络科技有限公司 | Moving target analyzing and tracking method and system based on video monitoring | 
| CN103679149A (en) * | 2013-12-11 | 2014-03-26 | 哈尔滨工业大学深圳研究生院 | Method and device for detecting crowd gathering expressed in convex hull based on angular points | 
| CN104933710A (en) * | 2015-06-10 | 2015-09-23 | 华南理工大学 | Intelligent analysis method of store people stream track on the basis of surveillance video | 
| CN105447458A (en) * | 2015-11-17 | 2016-03-30 | 深圳市商汤科技有限公司 | Large scale crowd video analysis system and method thereof | 
| CN106023262A (en) * | 2016-06-06 | 2016-10-12 | 深圳市深网视界科技有限公司 | Crowd flowing main direction estimating method and device | 
Non-Patent Citations (4)
| Title | 
|---|
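| LIXIN CHEN et al.: "Detecting Anomaly Based on Time Dependence for Large Scenes", 《PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION》 * | 
| MD. HAIDAR SHARIF et al.: "Crowd Behavior Surveillance Using Bhattacharyya Distance Metric", 《COMPIMAGE》 * | 
| Lan Hong (兰红) et al.: "Sparse optical flow target extraction and tracking under a dynamic background" (动态背景下的稀疏光流目标提取与跟踪), 《中国图象图形学报》 (Journal of Image and Graphics) * | 
| Cao Zhitong (曹志通) et al.: "Improved video people-counting method based on corner detection" (改进的基于角点检测的视频人数统计方法), 《计算机应用》 (Journal of Computer Applications) * | 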
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN109858367A (en) * | 2018-12-29 | 2019-06-07 | 华中科技大学 | The vision automated detection method and system that worker passes through support unsafe acts | 
| CN110263619A (en) * | 2019-04-30 | 2019-09-20 | 上海商汤智能科技有限公司 | Image processing method and device, electronic equipment and computer storage medium | 
| CN111027387A (en) * | 2019-11-11 | 2020-04-17 | 北京百度网讯科技有限公司 | Method, device and storage medium for obtaining population evaluation and evaluation model | 
| CN111027387B (en) * | 2019-11-11 | 2023-09-26 | 北京百度网讯科技有限公司 | Number of people assessment and assessment model acquisition methods, devices and storage media | 
| CN113408857A (en) * | 2021-05-24 | 2021-09-17 | 柳州东风容泰化工股份有限公司 | Management method and system for thioacetic acid leakage emergency treatment | 
| CN114550085A (en) * | 2022-02-17 | 2022-05-27 | 上海商汤智能科技有限公司 | Crowd positioning method and device, electronic equipment and storage medium | 
| CN114550085B (en) * | 2022-02-17 | 2025-07-18 | 上海商汤智能科技有限公司 | Crowd positioning method and device, electronic equipment and storage medium | 
| CN115909508A (en) * | 2023-01-06 | 2023-04-04 | 浙江大学计算机创新技术研究院 | Image key point enhancement detection method under single-person sports scene | 
| CN115909508B (en) * | 2023-01-06 | 2023-06-02 | 浙江大学计算机创新技术研究院 | Image key point enhancement detection method under single sports scene | 
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20171215 |