Autonoma drönare: Modifiering av belöningsfunktion i AirSim (Autonomous drones: Modifying the reward function in AirSim)
2018 (Swedish). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

Drones are growing in popularity, and so is research in the field of autonomous drones. There are several research problems around autonomous vehicles in general, but the one covered by this study is the autonomous manoeuvring of drones. One promising approach is deep reinforcement learning, which combines deep neural networks with reinforcement learning. Problems that researchers often encounter in the field range from time-consuming training and effective manoeuvring to unpredictability and safety, and the high cost of testing can also be an issue. With the help of simulation programs, algorithms can be tested without concern for cost or other real-world factors that would otherwise limit the work. Microsoft's simulator AirSim lets users control the vehicle through an application programming interface, which makes it possible to test a variety of algorithms. The research question addressed in this study is how the pre-existing reward function can be improved at avoiding obstacles and moving the drone from start to goal. The goal of the study is to find improvements to the reward function of AirSim's pre-existing Deep Q-Network algorithm and to test them in two different simulated environments. By conducting several experiments and storing the evaluation metrics produced by the agents, results could be observed. The evaluation metrics included the average reward the agent received over time, the number of collisions, and overall performance in the respective environment. We were not able to gather enough data to measure an improvement in the evaluation metrics for the modified reward function. The modified function performed well but did not display any substantially improved performance. More research is needed to determine whether one reward function is better than the other. Given the difficulties of gathering data, the conclusion is that we created a reward function that we cannot tell is better or worse than the pre-existing one.
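The abstract does not reproduce the thesis's actual modified reward function, so the following is only a minimal sketch of the kind of reward shaping it describes, assuming AirSim's Python client. The function name compute_reward, the goal coordinates, and the penalty and bonus values are illustrative assumptions, not the authors' actual design.

```python
import math

import airsim  # AirSim's Python client (pip install airsim)


def compute_reward(client, goal, prev_dist,
                   collision_penalty=-100.0, step_penalty=-0.5):
    """Hypothetical shaped reward: progress toward the goal minus penalties."""
    # Read the drone's current position from the simulator (NED coordinates).
    pos = client.getMultirotorState().kinematics_estimated.position
    dist = math.sqrt((pos.x_val - goal[0]) ** 2 +
                     (pos.y_val - goal[1]) ** 2 +
                     (pos.z_val - goal[2]) ** 2)

    # A collision ends the episode with a large negative reward.
    if client.simGetCollisionInfo().has_collided:
        return collision_penalty, dist, True

    # Reward the progress made since the last step, with a small cost per
    # step to discourage hovering in place.
    reward = (prev_dist - dist) + step_penalty

    # Reaching the vicinity of the goal ends the episode with a bonus.
    if dist < 3.0:
        return reward + 100.0, dist, True
    return reward, dist, False


if __name__ == "__main__":
    client = airsim.MultirotorClient()
    client.confirmConnection()
    client.enableApiControl(True)
    goal = (40.0, 0.0, -10.0)  # example goal position, purely illustrative
    reward, dist, done = compute_reward(client, goal, prev_dist=50.0)
    print(reward, dist, done)
```

Shaping the reward on per-step progress (prev_dist - dist) rather than on raw distance keeps the per-step values small and roughly zero-centred, which generally helps stabilise DQN training; the thesis's own modification may differ in these details.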

Place, publisher, year, edition, pages
2018.
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:hb:diva-25529
OAI: oai:DiVA.org:hb-25529
DiVA, id: diva2:1566546
Subject / course
Informatics
Available from: 2021-06-24. Created: 2021-06-15. Last updated: 2021-06-24. Bibliographically approved.

Open Access in DiVA

KSAI05 (1170 kB), 55 downloads
File information
File name: FULLTEXT01.pdf
File size: 1170 kB
Checksum (SHA-512): 07787e6afb3038576cd093fc241f4641c3b5da367deb15782b1e36175bc7c44c79f7570c08e5d9181478a07b46baab78382aee708a6290a272ef80b964526972
Type: fulltext
Mimetype: application/pdf
