Current Issue

27 October 2022, Volume 33 Issue 5
Half space object classification via incident angle based fusion of radar and infrared sensors
Zhenyu HE, Xiaodong ZHUGE, Junxiang WANG, Shihao YU, Yongjun XIE, Yuxiong ZHAO
2022, 33(5):  1025-1031.  doi:10.23919/JSEE.2022.000100

In this paper, we introduce an incident angle based fusion method for radar and infrared sensors to improve the recognition rate of complex targets in half space scenarios, taking vehicles on the ground as the example. For the radar sensor, a convolutional operation is introduced into the autoencoder, and a “winner-take-all (WTA)” convolutional autoencoder (CAE) is used to improve the recognition rate of the radar high resolution range profile (HRRP). Moreover, the HRRP in half space is more complex than that in free space. To get closer to the real situation, the half space HRRP is simulated as the dataset. The recognition rate increases by more than 7% compared with the traditional CAE or denoised sparse autoencoder (DSAE). For the infrared sensor, a convolutional neural network (CNN) is used for infrared image recognition. Finally, we combine the two results with the Dempster-Shafer (D-S) evidence theory, and a discounting operation is introduced in the fusion to improve the recognition rate. The recognition rate after fusion increases by more than 7% compared with a single sensor. After the discounting operation, the accuracy is improved by 1.5%, which validates the effectiveness of the proposed method.
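
The fusion step described in this abstract can be illustrated with a minimal sketch of Dempster's rule of combination preceded by classical Shafer discounting. This is not the authors' implementation: the class names, the mass values, and the reliability factors 0.9 and 0.8 are hypothetical, and the abstract does not state how the discount factors are derived from the incident angle.

```python
from itertools import product


def discount(mass, alpha, frame):
    """Shafer discounting: keep a fraction alpha of each mass and move
    the remainder (1 - alpha) onto the full frame of discernment."""
    out = {focal: alpha * m for focal, m in mass.items() if focal != frame}
    out[frame] = alpha * mass.get(frame, 0.0) + (1.0 - alpha)
    return out


def combine(m1, m2):
    """Dempster's rule for two mass functions keyed by frozenset."""
    fused = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in fused.items()}


# Hypothetical classes and per-sensor classifier outputs (assumptions).
frame = frozenset({"car", "truck", "tank"})
radar = {frozenset({"car"}): 0.6, frozenset({"truck"}): 0.3, frozenset({"tank"}): 0.1}
infrared = {frozenset({"car"}): 0.5, frozenset({"truck"}): 0.2, frozenset({"tank"}): 0.3}

# Hypothetical reliability factors for each sensor.
fused = combine(discount(radar, 0.9, frame), discount(infrared, 0.8, frame))
best = max(fused, key=fused.get)
```

Discounting leaves some mass on the whole frame, which softens the conflict between sensors before the combination is normalized.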

Reflection separation technology based on polarization characteristics
Yan ZHANG, Jinghua ZHANG, Zhiguang SHI, Yu ZHANG, Feng LING
2022, 33(5):  1032-1042.  doi:10.23919/JSEE.2022.000101

To address the problem of reflected light on the surface of a transparent body, the polarization characteristics of the reflection region are analyzed, and a polarization characterization model combining the reflection and transmission effects is established. Based on this analysis, the minimum of the normalized cross-correlation (NCC) coefficient between the transmission and reflection images is found by gradient descent, and the polarization degrees at the minimum correlation are obtained. According to the distribution of the transmitted and reflected light in the perpendicular and parallel directions, reflection separation is realized via the polarized orthogonality difference algorithm based on the degree of reflection polarization and the degree of transmission polarization.
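
As a small illustration of the quantity being minimized, the sketch below computes an NCC coefficient between two images flattened to lists of pixel values. The zero-mean form of the NCC is an assumption, since the abstract does not give the exact definition, and the gradient descent over the polarization degrees is not reproduced here.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length
    sequences of pixel values; the result lies in [-1, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if den == 0.0:
        return 0.0  # a constant image carries no correlation information
    return num / den


# Hypothetical flattened image patches (assumptions for illustration).
transmission = [0.2, 0.4, 0.6, 0.8]
reflection = [0.9, 0.1, 0.8, 0.2]
score = ncc(transmission, reflection)
```

A coefficient near zero indicates that the two candidate layers share little structure, which is the condition the separation seeks.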

Label correlation for partial label learning
Lingchi GE, Min FANG, Haikun LI, Bo CHEN
2022, 33(5):  1043-1051.  doi:10.23919/JSEE.2022.000102

Partial label learning aims to learn a multi-class classifier, where each training example corresponds to a set of candidate labels among which only one is correct. Most studies in the label space have focused only on the difference between candidate labels and non-candidate labels. So far, however, there has been little discussion of label correlation in partial label learning. This paper begins with a study of label correlation, followed by the establishment of a unified framework that integrates the label correlation, the adaptive graph, and the semantic difference maximization criterion. This work offers fresh insight into the acquisition of learning information from the label space. Specifically, the label correlation is calculated from the candidate label set and is used to obtain the similarity of each pair of instances in the label space. After that, the labeling confidence for each instance is updated under the smoothness assumption that two instances should have similar outputs in the label space if they are close in the feature space. Finally, an effective optimization procedure is used to solve the unified framework. Extensive experiments on artificial and real-world data sets indicate the superiority of the proposed method over state-of-the-art partial label learning methods.

DOA estimation based on multi-frequency joint sparse Bayesian learning for passive radar
Jinfang WEN, Jianxin YI, Xianrong WAN, Ziping GONG, Ji SHEN
2022, 33(5):  1052-1063.  doi:10.23919/JSEE.2022.000103

This paper considers multi-frequency passive radar and develops a multi-frequency joint direction of arrival (DOA) estimation algorithm to improve estimation accuracy and resolution. The developed algorithm exploits the sparsity of targets in the spatial domain. Specifically, we first extract the required frequency channel data and acquire the snapshot data through a series of preprocessing steps such as clutter suppression, coherent integration, beamforming, and constant false alarm rate (CFAR) detection. Then, based on the framework of sparse Bayesian learning, the target’s DOA is estimated by jointly extracting the multi-frequency data via evidence maximization. Simulation results show that the developed algorithm has better estimation accuracy and resolution than other existing multi-frequency DOA estimation algorithms, especially in scenarios with a low signal-to-noise ratio (SNR) and few snapshots. Furthermore, its effectiveness is verified with field experimental data from a multi-frequency FM-based passive radar.

Multiple transformation analysis for interference separation in TDCS
Guisheng WANG, Yequn WANG, Shufu DONG, Guoce HUANG
2022, 33(5):  1064-1078.  doi:10.23919/JSEE.2022.000104

Various types of interference signals limit the practical application of transform domain communication systems (TDCSs) in a severe electromagnetic environment. An orthogonal basis learning method of transformation analysis (OBL-TA) is proposed to address the problem of obtaining an optimal transform domain based on sparse representation. Then, the sparse availability is utilized to obtain the optimal transformation analysis by iterative methods, which yields the sparse representation for transform domain (SRTD) in unrestricted form. In addition, the iterative version of SRTD (I-SRTD) in unrestricted form is obtained by decomposing the SRTD problem into three sub-problems, each of which is solved iteratively by learning the best orthogonal basis. Furthermore, orthogonal basis learning via cost function minimization is conducted by stochastic descent, which is guaranteed to converge at least to a local minimum. Finally, the optimal transformation analysis is developed by assessing the effectiveness of different transform domains according to the accuracy of the sparse representation, and an optimal transformation analysis separately (OPTAS) is applied to the synthesized signal forms with conic alternatives, dualization, and smoothing. Simulation results demonstrate that the proposed methods achieve optimal recovery and separation more rapidly and accurately than conventional methods.

Gaussian process regression-based quaternion unscented Kalman robust filter for integrated SINS/GNSS
Xu LYU, Baiqing HU, Yongbin DAI, Mingfang SUN, Yi LIU, Duanyang GAO
2022, 33(5):  1079-1088.  doi:10.23919/JSEE.2022.000105

High-precision filtering estimation is one of the key techniques for the strapdown inertial navigation system/global navigation satellite system (SINS/GNSS) integrated navigation system, and its estimation plays an important role in the performance evaluation of the navigation system. Traditional filter estimation methods usually assume that the measurement noise conforms to the Gaussian distribution, without considering the contamination introduced by the GNSS signal, which is susceptible to external interference. To address this problem, a high-precision filter estimation method using Gaussian process regression (GPR) is proposed to enhance the prediction and estimation capability of the unscented quaternion estimator (USQUE) and thus improve the navigation accuracy. Based on the advantage of the GPR machine learning function, the estimation performance of the sliding window for model training is measured. This method estimates the output of the observation information source through the measurement window and realizes the robust measurement update of the filter. The combination of GPR and the USQUE algorithm establishes a robust mechanism framework, which enhances the robustness and stability of traditional methods. The results of the trajectory simulation experiment and SINS/GNSS car-mounted tests indicate that the strategy has strong robustness and high estimation accuracy, which demonstrates the effectiveness of the proposed method.

Modified OMP method for multi-target parameter estimation in frequency-agile distributed MIMO radar
Wenge XING, Chuanrui ZHOU, Chunlei WANG
2022, 33(5):  1089-1094.  doi:10.23919/JSEE.2022.000106

Introducing frequency agility into a distributed multiple-input multiple-output (MIMO) radar can significantly enhance its anti-jamming ability. However, it would cause the sidelobe pedestal problem in multi-target parameter estimation. Sparse recovery is an effective way to address this problem, but it cannot be directly utilized for multi-target parameter estimation in frequency-agile distributed MIMO radars due to spatial diversity. In this paper, we propose an algorithm for multi-target parameter estimation according to the signal model of frequency-agile distributed MIMO radars, by modifying the orthogonal matching pursuit (OMP) algorithm. The effectiveness of the proposed method is then verified by simulation results.

A parallel pipeline connected-component labeling method for on-orbit space target monitoring
Zongling LI, Qingjun ZHANG, Teng LONG, Baojun ZHAO
2022, 33(5):  1095-1107.  doi:10.23919/JSEE.2022.000107

This paper designs a peripheral maximum gray difference (PMGD) image segmentation method, a connected-component labeling (CCL) algorithm based on dynamic run length (DRL), and a real-time streaming processor implementation of DRL-CCL. Their function and performance in a space target monitoring scene are verified by an experiment carried on the Tianzhou-3 cargo spacecraft (TZ-3). The PMGD image segmentation method can quickly segment the image into highly discrete and simple point targets, which greatly reduces the generation of equivalences and improves the real-time performance of DRL-CCL. Through parallel pipeline design, the storage of the streaming processor is optimized by 55% with no need for external memory, the logic is optimized by 60%, and the energy efficiency ratio is 12 times that of the graphics processing unit, 62 times that of the digital signal processor, and 147 times that of personal computers. For the 8756 images processed on-orbit, the speed reaches up to 5.88 FPS and the target detection rate is 100%. Our algorithm and implementation method meet the requirements of lightweight, high real-time, strongly robust, full-time, and stable operation in the space irradiation environment.
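
To make the notions of runs and equivalences concrete, the following is a minimal sketch of a generic run-based connected-component labeling pass with a union-find table. It is not the DRL-CCL algorithm or the streaming processor from the paper. The 8-connectivity choice and the test image are assumptions.

```python
def label_runs(image):
    """Label 8-connected foreground components of a binary image
    (list of rows) by merging horizontal runs with union-find."""
    parent = []  # parent[i] is the union-find parent of provisional label i

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # record an equivalence

    runs = []  # (row, start, end_exclusive, provisional_label)
    prev = []  # runs of the previous row as (start, end_exclusive, label)
    for r, row in enumerate(image):
        cur = []
        c = 0
        while c < len(row):
            if not row[c]:
                c += 1
                continue
            start = c
            while c < len(row) and row[c]:
                c += 1
            label = len(parent)
            parent.append(label)
            # Runs touch (including diagonally) when their column
            # ranges overlap after widening by one pixel.
            for ps, pe, pl in prev:
                if ps <= c and pe >= start:
                    union(label, pl)
            cur.append((start, c, label))
            runs.append((r, start, c, label))
        prev = cur

    remap = {}
    labels = [[0] * len(row) for row in image]
    for r, start, end, label in runs:
        comp = remap.setdefault(find(label), len(remap) + 1)
        for col in range(start, end):
            labels[r][col] = comp
    return labels, len(remap)


image = [
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 1, 1],
]
labels, count = label_runs(image)
```

Each `union` call corresponds to one equivalence between provisional labels, so an image segmented into small isolated point targets produces few of them.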

A situation awareness assessment method based on fuzzy cognitive maps
Jun CHEN, Xudong GAO, Jia RONG, Xiaoguang GAO
2022, 33(5):  1108-1122.  doi:10.23919/JSEE.2022.000108

The status of an operator’s situation awareness is one of the critical factors that influence the quality of missions. Thus, measuring the situation awareness status is an important research topic. So far, many methods have been designed for measuring situation awareness status, but no model can measure it accurately in real time, and this work is conducted to fill that gap. Firstly, the relevant physiological data of operators are collected while they perform a specific mission, and their situation awareness status is simultaneously measured using the situation awareness global assessment technique (SAGAT), which is known for its accuracy but cannot be used in real time. Then, after the raw data are preprocessed, the physiological data are used as features and the SAGAT results as labels to train a fuzzy cognitive map (FCM), which is an explainable and powerful intelligent model. In addition, a hybrid learning algorithm combining particle swarm optimization (PSO) and gradient descent is proposed for the FCM training. The final results show that the learned FCM can assess the status of situation awareness accurately in real time, and that the proposed hybrid learning algorithm has better efficiency and accuracy.
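
For readers unfamiliar with fuzzy cognitive maps, the sketch below shows one common form of FCM inference: each concept adds the weighted activations of the other concepts to its own value and passes the sum through a sigmoid. The update variant with self-memory, the concept names, and the weights are all assumptions. The paper's PSO and gradient descent training is not reproduced.

```python
import math


def fcm_step(state, weights, lam=1.0):
    """One synchronous FCM update; weights[j][i] is the influence of
    concept j on concept i, and lam is the sigmoid steepness."""
    n = len(state)
    nxt = []
    for i in range(n):
        total = state[i] + sum(
            weights[j][i] * state[j] for j in range(n) if j != i
        )
        nxt.append(1.0 / (1.0 + math.exp(-lam * total)))
    return nxt


def fcm_run(state, weights, steps=50, tol=1e-6):
    """Iterate the map until the activations stop changing."""
    for _ in range(steps):
        nxt = fcm_step(state, weights)
        if max(abs(a - b) for a, b in zip(nxt, state)) < tol:
            return nxt
        state = nxt
    return state


# Hypothetical concepts: [heart_rate, pupil_diameter, situation_awareness].
weights = [
    [0.0, 0.0, 0.6],   # heart_rate raises the awareness concept
    [0.0, 0.0, -0.4],  # pupil_diameter lowers it
    [0.0, 0.0, 0.0],
]
final = fcm_run([0.7, 0.3, 0.5], weights)
```

Training an FCM amounts to choosing the entries of `weights` so that the output concept matches the labels, which is the role the abstract assigns to the hybrid learning algorithm.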

Review on artificial intelligence techniques for improving representative air traffic management capability
Jun TANG, Gang LIU, Qingtao PAN
2022, 33(5):  1123-1134.  doi:10.23919/JSEE.2022.000109

The use of artificial intelligence (AI) has increased since the middle of the 20th century, as evidenced by its applications to a wide range of engineering and science problems. Air traffic management (ATM) is becoming increasingly automated and autonomous, making it an attractive domain for AI applications. This paper presents a systematic review of studies that employ AI techniques for improving ATM capability. A brief account of the history, structure, and advantages of these methods is provided, followed by a description of their applications to several representative ATM tasks, such as air traffic services (ATS), airspace management (AM), air traffic flow management (ATFM), and flight operations (FO). The major contribution of the current review is the professional survey of AI applications to ATM along with a description of their specific advantages: (i) these methods provide alternative approaches to conventional physical modeling techniques, (ii) these methods do not require knowing relevant internal system parameters, (iii) these methods are computationally more efficient, and (iv) these methods offer compact solutions to multivariable problems. In addition, this review offers a fresh outlook on future research. One direction is to provide a clear rationale for the model type and structure selection for a given ATM mission. Another is to understand what makes a specific architecture or algorithm effective for a given ATM mission. These are among the most important issues that will continue to attract the attention of the AI research community and ATM work teams in the future.

Target threat estimation based on discrete dynamic Bayesian networks with small samples
Fang YE, Ying MAO, Yibing LI, Xinrui LIU
2022, 33(5):  1135-1142.  doi:10.23919/JSEE.2022.000076

The accuracy of target threat estimation has a great impact on command decision-making. The Bayesian network, as an effective way to deal with uncertainty, can be used to track changes in the target threat level. Unfortunately, the traditional discrete dynamic Bayesian network (DDBN) suffers from poor parameter learning and poor reasoning accuracy in a small sample environment where part of the prior information is missing. Considering the finiteness and discreteness of DDBN parameters, a fuzzy k-nearest neighbor (KNN) algorithm based on correlation of feature quantities (CF-FKNN) is proposed for DDBN parameter learning. Firstly, the correlation between feature quantities is calculated, and then the KNN algorithm with fuzzy weights is introduced to fill in the missing data. On this basis, a reasonable DDBN structure is constructed using expert experience to complete DDBN parameter learning and reasoning. Simulation results show that the CF-FKNN algorithm can accurately fill in the data when samples are seriously incomplete, and that it improves DDBN parameter learning in that case. With the proposed method, the final target threat assessment results are reasonable, which meets the needs of engineering applications.

Scenario-oriented hybrid particle swarm optimization algorithm for robust economic dispatch of power system with wind power
Bing WANG, Pengfei ZHANG, Yufeng HE, Xiaozhi WANG, Xianxia ZHANG
2022, 33(5):  1143-1150.  doi:10.23919/JSEE.2022.000110

An economic dispatch problem for a power system with wind power is discussed. Using discrete scenarios to describe uncertain wind power, a threshold is given to identify the bad scenario set. The bad-scenario-set robust economic dispatch model is established to minimize the total penalties on bad scenarios. A specialized hybrid particle swarm optimization (PSO) algorithm is developed by hybridizing simulated annealing (SA) operators. The SA operators are performed according to a scenario-oriented adaptive search rule in a neighborhood that is constructed based on the unit commitment constraints. Finally, an experiment is conducted. The computational results show that the developed algorithm outperforms the existing algorithms.

UAV safe route planning based on PSO-BAS algorithm
Honghong ZHANG, Xusheng GAN, Shuangfeng LI, Zhiyuan CHEN
2022, 33(5):  1151-1160.  doi:10.23919/JSEE.2022.000111

To address the problem that unmanned aerial vehicles (UAVs) ignore safety indicators and cannot guarantee safe operation in low-altitude airspace, a UAV route planning method that considers regional risk assessment is proposed. Firstly, the low-altitude airspace is discretized by rasterization, and then the UAV operating characteristics and environmental characteristics are combined to quantify the risk value in the low-altitude airspace and obtain a 3D risk map. With the path risk value taken as the cost, the particle swarm optimization-beetle antennae search (PSO-BAS) algorithm is used to plan the spatial 3D route, which effectively reduces the redundancy of the generated path. Finally, a cubic B-spline curve is used to smooth the planned discrete path, generating a flyable path with continuous curvature and pitch angle. The simulation results show that the method obtains a path with a lower risk value at a lower path cost. At the same time, the path redundancy is low, and the curvature and pitch angle change continuously, so the path is flyable and meets the UAV performance constraints.
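
The smoothing step can be illustrated with a minimal sketch of a uniform cubic B-spline evaluated over a list of 3D waypoints. The uniform knot spacing, the sampling density, and the waypoint coordinates are assumptions. The abstract does not say which B-spline parameterization the authors use.

```python
def bspline_point(p0, p1, p2, p3, t):
    """Evaluate one uniform cubic B-spline segment at t in [0, 1]
    from four consecutive control points."""
    b0 = (1 - t) ** 3 / 6
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    b3 = t**3 / 6
    return tuple(
        b0 * a + b1 * b + b2 * c + b3 * d for a, b, c, d in zip(p0, p1, p2, p3)
    )


def smooth_path(waypoints, samples_per_segment=10):
    """Sample the spline defined by the waypoints as control points."""
    if len(waypoints) < 4:
        raise ValueError("a cubic B-spline needs at least four control points")
    path = []
    for i in range(len(waypoints) - 3):
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            path.append(bspline_point(*waypoints[i:i + 4], t))
    path.append(bspline_point(*waypoints[-4:], 1.0))  # close the last segment
    return path


# Hypothetical discrete route through a rasterized airspace (x, y, z).
waypoints = [(0, 0, 10), (10, 0, 12), (10, 10, 15), (20, 10, 15), (30, 20, 12)]
smoothed = smooth_path(waypoints)
```

Adjacent cubic B-spline segments join with continuous second derivatives, which is what yields a continuous curvature profile. Note that a uniform B-spline approximates its control points and does not pass through the first and last waypoint unless they are repeated.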

Operational effectiveness evaluation based on the reduced conjunctive belief rule base
Ziwei ZHANG, Qisheng GUO, Zhiming DONG, Hongxiang LIU, Ang GAO, Pengcheng QI
2022, 33(5):  1161-1172.  doi:10.23919/JSEE.2022.000112

To address the issue of rule premise combination explosion in the construction of the traditional complete conjunctive belief rule base (BRB), this paper introduces an orthogonal design method to reduce the conjunctive BRB. The reasoning method based on reduced conjunctive BRB is designed with the help of the conversion technology from conjunctive BRB to disjunctive BRB. Finally, the operational mission effectiveness evaluation is taken as an example to verify the proposed method. The results show that the method proposed in this paper is feasible and effective.

Hierarchical reinforcement learning guidance with threat avoidance
Bohao LI, Yunjie WU, Guofei LI
2022, 33(5):  1173-1185.  doi:10.23919/JSEE.2022.000113

The guidance strategy is an extremely critical factor in determining the striking effect of the missile operation. A novel guidance law is presented by exploiting the deep reinforcement learning (DRL) with the hierarchical deep deterministic policy gradient (DDPG) algorithm. The reward functions are constructed to minimize the line-of-sight (LOS) angle rate and avoid the threat caused by the opposed obstacles. To attenuate the chattering of the acceleration, a hierarchical reinforcement learning structure and an improved reward function with action penalty are put forward. The simulation results validate that the missile under the proposed method can hit the target successfully and keep away from the threatened areas effectively.

Hybrid Q-learning for data-based optimal control of non-linear switching system
Xiaofeng LI, Lu DONG, Changyin SUN
2022, 33(5):  1186-1194.  doi:10.23919/JSEE.2022.000114

In this paper, the optimal control of a non-linear switching system is investigated without knowledge of the system dynamics. First, the Hamilton-Jacobi-Bellman (HJB) equation is derived with consideration of the hybrid action space. Then, a novel data-based hybrid Q-learning (HQL) algorithm is proposed to find the optimal solution in an iterative manner. In addition, a theoretical analysis is provided to establish the convergence and optimality of the proposed algorithm. Finally, the algorithm is implemented with the actor-critic (AC) structure, and two linear-in-parameter neural networks are utilized to approximate the functions. Simulation results validate the effectiveness of the data-driven method.

Maneuvering target state estimation based on separate modeling of target trajectory shape and dynamic characteristics
Zhuanhua ZHANG, Gongjian ZHOU
2022, 33(5):  1195-1209.  doi:10.23919/JSEE.2022.000115

The state estimation of a maneuvering target whose trajectory shape is independent of its dynamic characteristics is studied. The conventional motion models in Cartesian coordinates imply that the trajectory of a target is completely determined by its dynamic characteristics. However, this is not true in road-target, sea-route-target, or flight-route-target tracking, where the target trajectory shape is decoupled from the target velocity properties. In this paper, a new estimation algorithm based on separate modeling of target trajectory shape and dynamic characteristics is proposed. The trajectory of a target over a sliding window is described by a linear function of the arc length. To determine the unknown target trajectory, an augmented system is derived by denoting the unknown coefficients of the function as states in mileage coordinates. At every estimation cycle except the first one, the interaction (mixing) stage of the proposed algorithm starts from the latest estimated base state and a recalculated parameter vector, which is determined by least squares (LS). Numerical experiments are conducted to assess the performance of the proposed algorithm. Simulation results show that the proposed algorithm can achieve better performance than the conventional coupled model-based algorithms in the presence of target maneuvers.

Impact angle constrained fuzzy adaptive fault tolerant IGC method for Ski-to-Turn missiles with unsteady aerodynamics and multiple disturbances
Hang GUO, Zheng WANG, Bin FU, Kang CHEN, Wenxing FU, Jie YAN
2022, 33(5):  1210-1226.  doi:10.23919/JSEE.2022.000116

An impact angle constrained fuzzy adaptive fault tolerant integrated guidance and control method for Ski-to-Turn (STT) missiles subject to unsteady aerodynamics and multiple disturbances is proposed. Unsteady aerodynamics appears when flight vehicles are in a transonic state or confronted with unstable airflow. Meanwhile, actuator failures and multisource model uncertainties are introduced, and the boundaries of these multisource uncertainties are assumed to be unknown. The target is assumed to execute highly maneuvering movement that is unknown to the missile. Furthermore, the impact angle constraint places higher requirements on the interception accuracy of the integrated guidance and control (IGC) method. The impact angle constraint and precise interception are established as the objective of the IGC method. Then, the boundaries of the lumped disturbances are estimated, and several fuzzy logic systems are introduced to compensate for the unknown nonlinearities and uncertainties. Next, a series of adaptive laws are developed so that the undesirable effects arising from unsteady aerodynamics, actuator failures, and unknown uncertainties can be suppressed. Consequently, an impact angle constrained fuzzy adaptive fault tolerant IGC method with three loops is constructed, and a hit-to-kill interception with the specified impact angle can be achieved. Finally, numerical simulations are conducted to verify the effectiveness and superiority of the proposed method.

Design and simulation of the ATP system considering the advanced targeting angle in quantum positioning system
Shuang CONG, Xiang ZHANG, Shiqi DUAN
2022, 33(5):  1227-1236.  doi:10.23919/JSEE.2022.000117

A compensation implementation scheme for the advanced targeting process based on the fine tracking system is proposed in this paper. Based on the working process of the quantum positioning system (QPS) and its acquisition, tracking and pointing (ATP) system, the advanced targeting subsystem of the ATP system is designed. Based on six orbital parameters of the quantum satellite Mozi, the advanced targeting azimuth angle and pitch angle are transformed into the dynamic tracking center of the fine tracking system in the ATP system. The deviation of the advanced targeting process is analyzed. A simulation experiment of the ATP system that considers the deviation compensation of the advanced targeting is carried out in Simulink, and the results are analyzed.

Improved adaptively robust estimation algorithm for GNSS spoofer considering continuous observation error
Yangjun GAO, Guangyun LI, Zhiwei LYU, Lundong ZHANG, Zhongpan LI
2022, 33(5):  1237-1248.  doi:10.23919/JSEE.2022.000118

Once a spoofer has taken control of the navigation system of an unmanned aerial vehicle (UAV), it is hard to make the error converge to the threshold condition by adjusting estimation parameters alone if the spoofer’s estimate of the UAV has continuous observation error. To address this problem, the influence of the spoofer’s state estimation error on the spoofing effect and the error convergence conditions is theoretically analyzed, and an improved adaptively robust estimation algorithm suitable for the steady-state linear quadratic estimator is proposed. It enables the spoofer’s estimator to reliably estimate the UAV status in real time, improves the robustness of the estimator to observation errors, and shortens the convergence time of error control. Simulation experiments show that, when the improved algorithm is used, the mean value of the normalized innovation squared (NIS) is reduced by 88.5%, the convergence time of the NIS value by 76.3%, the convergence time of the true trajectory error of the UAV by 42.3%, the convergence time of the estimated trajectory error of the UAV by 67.4%, the convergence time of the estimated trajectory error of the spoofer by 33.7%, and the convergence time of the broadcast trajectory error of the spoofer by 54.8%. The improved algorithm can make the UAV deviate from the preset trajectory to the spoofing trajectory more effectively and more subtly.

Research on virtual entity decision model for LVC tactical confrontation of army units
Ang GAO, Qisheng GUO, Zhiming DONG, Zaijiang TANG, Ziwei ZHANG, Qiqi FENG
2022, 33(5):  1249-1267.  doi:10.23919/JSEE.2022.000119

According to the requirements that the live-virtual-constructive (LVC) tactical confrontation (TC) places on the virtual entity (VE) decision model, namely graded combat capability, diversified actions, real-time decision-making, and generalization for the enemy, the confrontation process is modeled as a zero-sum stochastic game (ZSG). By introducing the theory of dynamic relative power potential field, the problem of reward sparsity in the model can be solved. By reward shaping, the problem of credit assignment between agents can be solved. Based on the idea of meta-learning, an extensible multi-agent deep reinforcement learning (EMADRL) framework and solving method are proposed to improve the effectiveness and efficiency of model solving. Experiments show that the model meets the requirements well and that the learning efficiency of the algorithm is high.

Joint optimization of inspection-based and age-based preventive maintenance and spare ordering policies for single-unit systems
Weining MA, Fei ZHAO, Xin LI, Qiwei HU, Bingcong SHANG
2022, 33(5):  1268-1280.  doi:10.23919/JSEE.2022.000120

This paper presents a joint optimization policy of preventive maintenance (PM) and spare ordering for single-unit systems, which deteriorate according to the delay-time concept with three deterioration stages. PM activities that combine a non-periodic inspection scheme with age-replacement are implemented. When an inspection first detects that the system is in the minor defective stage, an order is placed and the inspection interval is shortened. If the system has deteriorated to the severe defective stage, it is either repaired imperfectly or replaced by a new spare. However, an immediate replacement is required once the system fails, the maximal number of imperfect maintenance (IPM) actions is reached, or its age reaches a pre-specified threshold. Depending on whether the spare is available when needed, there are three types of decisions: an immediate or a delayed replacement by a regularly ordered spare, or an immediate replacement by an expedited ordered spare at a relatively higher cost. Then, some mutually independent and exclusive renewal events at the end of a renewal cycle are discussed, and the optimization model of such a joint policy is further developed by minimizing the long-run expected cost rate to find the optimal inspection and age-replacement intervals and the maximum number of IPM actions. A Monte-Carlo based integration method is also designed to solve the proposed model. Finally, a numerical example is given to illustrate the proposed joint optimization policy and the performance of the Monte-Carlo based integration method.