The allocation of healthcare resources on ships is crucial for safety and well-being due to limited access to external aid. Proficient medical staff on board provide a mobile healthcare facility, offering a range of services from first aid to complex procedures. This paper presents a system model utilizing Reinforcement Learning (RL) to optimize doctor-patient assignments and resource allocation in maritime settings. The RL approach focuses on dynamic, sequential decision-making, employing Q-learning to adapt to changing conditions and maximize cumulative rewards. Our experimental setup involves a simulated healthcare environment with variable patient conditions and doctor availability, operating within a 24-hour cycle. The Q-learning algorithm iteratively learns optimal strategies to enhance resource utilization and patient outcomes, prioritizing emergency cases while balancing the availability of medical staff. The results highlight the potential of RL in improving healthcare delivery on ships, demonstrating the system's effectiveness in dynamic, time-constrained scenarios and contributing to overall maritime safety and operational resilience.
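The abstract above describes a Q-learning agent that prioritizes emergency cases while balancing doctor availability. A minimal tabular sketch of that idea follows; the state encoding, action set, reward values, and hyperparameters are illustrative assumptions, not taken from the paper:

```python
import random

# Minimal tabular Q-learning sketch for doctor-patient assignment.
# A state is the severity of the arriving patient; actions either assign
# an available doctor or defer the case. Rewards (assumed here) favor
# immediate assignment of emergencies.

ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # learning rate, discount, exploration
SEVERITIES = ("routine", "urgent", "emergency")
ACTIONS = ("assign", "defer")

def reward(severity, action):
    # Deferring an emergency is heavily penalized; routine cases may wait.
    table = {
        ("emergency", "assign"): 10, ("emergency", "defer"): -10,
        ("urgent", "assign"): 5,     ("urgent", "defer"): -2,
        ("routine", "assign"): 1,    ("routine", "defer"): 0,
    }
    return table[(severity, action)]

def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in SEVERITIES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(SEVERITIES)
        # Epsilon-greedy action selection.
        if rng.random() < EPS:
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: q[(s, x)])
        r = reward(s, a)
        s_next = rng.choice(SEVERITIES)  # next arriving patient
        best_next = max(q[(s_next, x)] for x in ACTIONS)
        # Standard Q-learning update.
        q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in SEVERITIES}
print(policy)
```

With this reward scheme the learned greedy policy assigns a doctor to every severity level, and in particular never defers an emergency; a richer state (e.g. number of free doctors in the 24-hour cycle) would make deferral of routine cases rational.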
Every year, about 30 million people travel by ship worldwide, often in extreme weather conditions and in an environment polluted by the ship's fuel combustion, among other factors that affect the health of both passengers and crew, creating a need for medical staff who are not always available. We introduce a model based on Reinforcement Learning (RL), used as the key approach in a dialogue system. It incorporates a Hierarchical Reinforcement Learning (HRL) model with Deep Q-Network layers for a dialogue-oriented diagnosis system; policy learning is integrated, as the policy gradients are already defined. We created a two-stage hierarchical strategy: a hierarchical structure with double-layer policies for automatic disease diagnosis. "Double layer" means the task is split into two sub-tasks, a high-level strategy and a low-level strategy. A user-simulator component communicates with the patient for symptom collection, and the low-level agent inquires about symptoms. Once collection is complete, the results are sent to the high-level agent, which activates the D-classifier for the final diagnosis. The diagnosis is then returned through the user simulator to the patient for verification. Every diagnosis made carries its own reward, which trains the system.
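The two-level loop described above (low-level agent collects symptoms via the user simulator, high-level agent triggers the D-classifier, and the verified diagnosis yields a reward) can be sketched as follows; the symptom-disease table, the overlap-based classifier, and all names are illustrative assumptions standing in for the learned DQN policies:

```python
# Sketch of the double-layer dialogue-diagnosis loop. The low-level agent
# queries a simulated patient symptom by symptom; the high-level agent then
# activates a classifier and receives a reward on verification. Disease
# data and the scoring rule are assumptions for illustration only.

DISEASES = {
    "flu": {"fever", "cough", "fatigue"},
    "seasickness": {"nausea", "dizziness"},
}

def user_simulator(true_disease, symptom):
    # The simulated patient answers yes/no to a single symptom inquiry.
    return symptom in DISEASES[true_disease]

def low_level_agent(true_disease, candidate_symptoms):
    # Inquire about each candidate symptom and keep positive findings.
    return {s for s in candidate_symptoms
            if user_simulator(true_disease, s)}

def d_classifier(findings):
    # Pick the disease whose known symptom set best overlaps the findings
    # (stand-in for the learned D-classifier).
    return max(DISEASES, key=lambda d: len(DISEASES[d] & findings))

def high_level_agent(true_disease):
    all_symptoms = set().union(*DISEASES.values())
    findings = low_level_agent(true_disease, all_symptoms)
    diagnosis = d_classifier(findings)
    # The simulator verifies the diagnosis; the reward would train
    # both policy layers in the full HRL setup.
    r = 1 if diagnosis == true_disease else -1
    return diagnosis, r

print(high_level_agent("seasickness"))  # → ('seasickness', 1)
```

In the full system both agents are DQN policies updated from this terminal reward rather than the hand-written rules used here.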