Autonoma drönare: Modifiering av belöningsfunktion i AirSim [Autonomous drones: Modification of the reward function in AirSim]
2018 (Swedish). Independent thesis, basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

Drones are growing in popularity, and so is research in the field of autonomous drones. There are several research problems around autonomous vehicles overall; one interesting problem covered by this study is the autonomous manoeuvring of drones. One promising path for autonomous drones is deep reinforcement learning, a combination of deep neural networks and reinforcement learning. Problems that researchers often encounter in the field range from time-consuming training and effective manoeuvring to unpredictability and security, and even the high cost of testing can be an issue. With the help of simulation programs, algorithms can be tested without concern for cost or other real-world factors that could limit the work. Microsoft's own simulator, AirSim, lets users control the vehicle through an application programming interface, which makes it possible to test a variety of algorithms. The research question addressed in this study is how the pre-existing reward function can be improved at avoiding obstacles and moving the drone from start to goal. The goal of this study is to find improvements to the reward function of AirSim's pre-existing Deep Q-Network algorithm and to test them in two different simulated environments. By conducting several experiments and storing the evaluation metrics produced by the agents, it was possible to observe a result. The observed evaluation metrics included the average reward the agent received over time, the number of collisions, and overall performance in the respective environment. We were not able to gather enough data to measure an improvement in the evaluation metrics for the modified reward function: the modified function performed well but did not display any substantially improved performance. More research is needed before one reward function can conclusively be judged better than the other. Given the difficulties of gathering data, the conclusion is that we created a reward function that cannot be said to be either better or worse than the pre-existing one.
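The abstract describes modifying the reward function of AirSim's Deep Q-Network example so that the agent avoids obstacles while moving from start to goal. As a rough illustration of what such a reward function can look like, below is a minimal Python sketch written against AirSim's Python client API. The goal position, collision penalty, and shaping constants are assumptions for illustration only; this is not the function evaluated in the thesis.

import math
import airsim

# All names and constants below are illustrative assumptions, not the thesis code.
GOAL = airsim.Vector3r(50.0, 0.0, -10.0)  # assumed goal position (NED coordinates)
COLLISION_PENALTY = -100.0                # assumed penalty for hitting an obstacle
DISTANCE_SCALE = 0.05                     # assumed reward-shaping coefficient
GOAL_THRESHOLD = 3.0                      # assumed distance at which the goal counts as reached

def compute_reward(client: airsim.MultirotorClient):
    """Return (reward, done): reward progress toward GOAL, punish collisions."""
    # A collision ends the episode with a large negative reward.
    if client.simGetCollisionInfo().has_collided:
        return COLLISION_PENALTY, True

    pos = client.getMultirotorState().kinematics_estimated.position
    dist = math.sqrt((pos.x_val - GOAL.x_val) ** 2
                     + (pos.y_val - GOAL.y_val) ** 2
                     + (pos.z_val - GOAL.z_val) ** 2)
    # Exponentially shaped distance term, similar in spirit to the distance-based
    # rewards in AirSim's published reinforcement-learning samples.
    reward = math.exp(-DISTANCE_SCALE * dist)
    return reward, dist < GOAL_THRESHOLD

A collision terminates the episode with a large negative reward, while the exponential distance term gives a dense signal that grows as the drone approaches the goal; this is the general shape of reward that the thesis's modifications target.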

Place, publisher, year, edition, pages
2018.
HSV category
Identifiers
URN: urn:nbn:se:hb:diva-25529
OAI: oai:DiVA.org:hb-25529
DiVA, id: diva2:1566546
Subject / course
Informatics
Available from: 2021-06-24 Created: 2021-06-15 Last updated: 2021-06-24 Bibliographically approved

Open Access in DiVA

KSAI05 (1170 kB) 55 downloads
File information
File: FULLTEXT01.pdf
File size: 1170 kB
Checksum SHA-512: 07787e6afb3038576cd093fc241f4641c3b5da367deb15782b1e36175bc7c44c79f7570c08e5d9181478a07b46baab78382aee708a6290a272ef80b964526972
Type: fulltext
Mimetype: application/pdf

Search outside of DiVA

Google · Google Scholar
Total: 55 downloads
The number of downloads is the sum of all downloads of all full texts. It may, for example, include earlier versions that are no longer available.

Total: 80 hits