AI, Python and Armory 3D/Blender

Didier · February 7, 2019, 4:39pm

Yes, it could simplify a lot of work.

The role of the neural network (NN) is to remove the need to hand-code specific algorithms for the kinds of behaviors you list. All you need to do is let your NN play and learn from its losses and wins.

For example, in the Tanks game, each action either earns a reward or not, and the NN learns which actions give the best reward using only the picture taken by a single camera in the scene.
The image could be replaced by other sensors; in the case of the robot, for example, I currently use the angles/quaternions of the robot arm.

What is also interesting is discovering the NN’s capacity to come up with new scenarios that no one had envisaged before.
What I also observed while training with Tanks is that the NN learns more quickly if you help it to win, for example by moving the other tank/target to a good place for getting a hit. It doesn’t matter to the NN whether you moved the target or not. So I think this is an example of what we will discover with NNs inside games in the future: designing a good training could itself reveal a good experience of the game situations. “How to win” could become “how to better train my NN”.

MagicLord · February 7, 2019, 6:23pm

Specific behaviour is there to put rules on the game AI, so the player knows how to play and how to take advantage of the AI’s flaws.

Some people were making a military simulation; it was so realistic that most of the time the player lost, and the game became frustrating, so a bad game.
It’s easy to make a game where the AI always finds the player and never misses; that’s a game people don’t play lol

But I think NNs can be used in a way that lets the difficulty be tweaked.
Well, I hope you’ll make a good demo game AI we could play, to see if the AI is good enough.
This could be interesting, for example for a Metal Gear Solid-like game AI.

Didier · February 8, 2019, 8:34am

@MagicLord
I share your opinion that a good game isn’t made by a super AI that performs too well. I also think it’s important, in those shooting-game examples with “behaviour trees” and “goal-oriented action planning”, to keep a good feeling of combat.

I propose applying NNs to another type of gameplay, more in the spirit of Age of Empires, or at the limit WorldOfWarships, or to games where you try to accomplish complex mission tasks.

My proposal lies elsewhere. Picture a chess game or a football game where you don’t move your pieces with the mouse, but let the pieces on the chessboard/field, each equipped with an NN, decide for themselves where to move. The interesting part of the game would be how to obtain the best training of the NN. You are in the position of the manager/teacher of your team.

It could also be, for example, a game where you engineer increasingly complex ROVs by training their NNs for increasingly complex mission tasks, inspired by existing underwater robot competitions (https://www.marinetech.org/rov-competition-2/) or by escape rooms.

Didier · February 11, 2019, 9:54am

Info about the robot arm kinematics progress: below is a short GIF giving an idea of the robot during its training (**see explanation at bottom).

Random joint angles are sent as the action to the robot arm at the beginning of the training; then, as the training progresses, values from the NN replace those random values. Thus, during the training, we see the robot arm find the right position to reach the given target more and more often.
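The hand-over from random actions to NN actions described above can be sketched as a decaying exploration rate. This is only a minimal illustration, not the ATRAP implementation: the linear schedule, the angle range in degrees, and the names `exploration_rate`, `choose_joint_angles` and `policy` are all assumptions made for the example.

```python
import random


def exploration_rate(step, total_steps, start=1.0, end=0.05):
    """Linearly decay the probability of sending random joint angles."""
    fraction = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * fraction


def choose_joint_angles(step, total_steps, policy, n_joints, rng=random):
    """Return random angles early in training and policy angles later on.

    `policy` is a hypothetical callable standing in for the NN: it returns
    one angle per joint for the current state.
    """
    if rng.random() < exploration_rate(step, total_steps):
        # Exploration: random joint angles in degrees (assumed range).
        return [rng.uniform(-180.0, 180.0) for _ in range(n_joints)]
    # Exploitation: use the values proposed by the NN.
    return policy()
```

With this schedule the arm moves almost entirely at random at step 0, and by the last step roughly 95% of the actions come from the NN.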


** In a forward kinematics problem, the location in space of the final end cap of the robot arm (the effector), i.e. the position and orientation of the entire arm, is determined from the joint angles. Given a position and orientation of the effector that reaches a goal, the inverse kinematics problem is to find the joint values that allow the arm to reach this location. There are various ways to solve the inverse kinematics problem, and in Armory you have the Forward and IK Node at your disposal, but I invite you to test it and verify that it’s not perfect. NNs are capable of representing these non-linear relationships between input and output data. This was already demonstrated with Tanks Armory/Atrap, and here the NN learns, through a try/reward mechanism, the link between Cartesian and joint space required by the inverse kinematics problem.
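To make the forward direction concrete, here is a minimal sketch for a planar two-link arm, which is much simpler than the industrial robot in ATRAP. The link lengths, the tolerance and the 0/1 reward are assumptions chosen for the illustration; the inverse problem is what the NN would have to learn from such rewards.

```python
import math


def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """Effector (x, y) of a planar two-link arm from joint angles in radians."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


def reward(theta1, theta2, goal, tolerance=0.05):
    """1.0 when the effector is within `tolerance` of the goal, else 0.0."""
    x, y = forward_kinematics(theta1, theta2)
    distance = math.hypot(x - goal[0], y - goal[1])
    return 1.0 if distance <= tolerance else 0.0
```

For example, `forward_kinematics(0.0, 0.0)` places the effector at `(2.0, 0.0)`, so `reward(0.0, 0.0, (2.0, 0.0))` is `1.0`, while pointing the arm straight up with `theta1 = math.pi / 2` misses that goal and earns `0.0`.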

zicklag · February 11, 2019, 4:22pm

Always glad to see your progress. Keep up the good work @Didier!

Didier · May 10, 2019, 5:53pm

The news: a 3D data visualization of the robotic arm training.

The neural network is displayed so that its evolution can be followed during the training of the industrial robot.
Each point is a neuron, and its color (inspired by the colors of infrared cameras) represents the synaptic coefficients.
The next evolution of this data visualization will integrate the evolution of the loss during the robot training.

The network is built automatically according to the Atrap parameters, using spawned objects with a unique color for each one… then comes the challenge of doing it without wasting too much of the PC’s resources (i5 CPU / Windows 10).

The first tests have been launched and will help to better understand how a neural network works, which is still a vast area of research today.

I also have a feeling that results obtained in maths for nuclear physics to represent certain things will be useful to me, as well as some of the processing used in radar systems.

Didier · May 10, 2019, 6:41pm

See for example the evolution of the neural network, now at batch 134 …

There are simple and fun questions to test and observe thanks to this neural network data visualization. For example: if I move the target to help the robot, or not, will the training be better and faster? It’s a little like asking whether we learn better from clear successes or failures rather than from average outcomes…

Didier · May 23, 2019, 1:17pm

After this first version of the NN, and most recently the visualization of the NN’s evolution during its training, it’s time for me to move on to a whole new approach, the main lines of which are as follows.

Observations and context:
When I look back at the choices I had to make when starting the software design in Armory for a relatively big application that requires speed, I would say that I was led to adopt an “object composition” approach in OOP because of the Node tree editor, i.e. graph structures and the sequences they allow (cf. In/Out of nodes, Events, Set Property, …).

I compare this result with the one I would have had by programming only in Haxe, which would have led me to a “class inheritance” approach, much more OOP.

However, currently, the core is built as a matrix of matrices of matrices with maths applied to it, and run-time performance is excellent.

Another observation is that the spaghetti design inside many of the Node trees I made makes things difficult to maintain and reuse. I have improved things by establishing quality rules, such as naming conventions, to help me find my way around. So I think that when creating a game, for example, and especially if the development team is large, making a major upfront effort on quality rules along these lines would also lead to a better experience.

Another observation is a gap in this first version of ATRAP: the ability to dynamically cut/set links between neurons during training, according to certain strategies. I think this could become fundamental for new generations of neural networks and architectures.

Then, because of the Events and Set Property nodes I used a lot in the last version of ATRAP, the next OOP design evolution I am considering is based on a “Subject / Observer” approach.

Neurons and Layers as Subject/Observer:

In this case, for example, inside the neural network, all observer neurons, as registered observers of other neurons, are notified and updated automatically when the subject neuron changes state. The same applies to Layers such as the Input, the x Hidden, and the Output Layers.

(This is to be compared with a standard approach, which could consist of a neural network instance that iterates through every connection between neurons in every Layer of the network.)
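A minimal sketch of the Subject/Observer idea for neurons, written in Python rather than as Armory logic nodes or Haxe. Everything here is hypothetical: the `Neuron` class, its method names and the ReLU activation are assumptions for illustration, and the sketch only works for feed-forward links (a cycle would notify forever).

```python
class Neuron:
    """A neuron that is both a subject and an observer (hypothetical design)."""

    def __init__(self, name, bias=0.0):
        self.name = name
        self.bias = bias
        self.inputs = {}     # subject neuron -> weight of the link
        self.observers = []  # neurons to notify when our output changes
        self.output = 0.0

    def attach(self, observer, weight):
        """Register `observer` so it is notified of our state changes."""
        self.observers.append(observer)
        observer.inputs[self] = weight

    def detach(self, observer):
        """Cut the link at run time and let the observer recompute."""
        self.observers.remove(observer)
        del observer.inputs[self]
        observer.update()

    def set_output(self, value):
        """Change state (e.g. an input neuron receiving a new value)."""
        self.output = value
        self.notify()

    def notify(self):
        for observer in self.observers:
            observer.update()

    def update(self):
        """Recompute the output from the subjects, then notify observers."""
        total = self.bias + sum(n.output * w for n, w in self.inputs.items())
        self.output = max(0.0, total)  # ReLU, an assumed activation
        self.notify()
```

With two inputs attached to one hidden neuron, setting an input propagates automatically, and `detach` removes a link during execution without rebuilding the network, which is the dynamic cut/set capability discussed in this post.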

If there is one improvement I intend to achieve, it’s in the capacity for dynamic evolution, i.e. removing/adding an Observer or Subject to discover new behavior in the NN, and thus improving, as never before, the dynamic transformation of the NN architecture during training according to the <State, Action, Reward> and the evolution of the loss and variance.

This may lead to exploring other innovative approaches for new generations of NNs, with more software-based and less mathematical methods of managing things like backpropagation within the NN. I think this concept could lead to the possibility of creating a meta-intelligence within the NN.

An important node of ATRAP is the “NN Factory”. It would then become an intelligent and dynamic “NN Architect Factory”.

The discussion here, Communication between objects, about OOP with @zicklag reflects how things are progressing too, for instance with his latest Call Function logic nodes and his idea of creating a new “Class” logic node and a special workflow for creating “Classes”, to allow an easier OOP approach in Armory.

Objectives of the next version, as in radar tracking
I hope that this new version of ATRAP will make it possible to explore new generations of NNs and, accordingly, new dynamically reconfigurable architectures.

I also think that this could open new doors, such as the similarities that exist between the physics of matter, signal processing and neural networks, which could then be more easily observed with this type of improvement in ATRAP.

For example, one of the paths I plan to explore is what we find in signal processing with the arrival of the FFT. (The most classic use of the Cooley-Tukey algorithm, for example, divides the transform into two parts of identical size n/2 at each step.) We can very well imagine that the dynamic transformation of the NN is of this type, with partitioning that increases or reduces the mass of neurons/layers during learning, that is, at each step “t”, as in this schema
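For reference, the Cooley-Tukey split into two halves of size n/2 mentioned above looks like this in its textbook recursive radix-2 form. This sketch only illustrates the FFT itself; how such a partitioning would map onto neurons and layers is not specified in this post and is not attempted here.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    # Split into two sub-transforms of identical size n/2.
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd half before recombining.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out
```

As a sanity check, a unit impulse `[1, 0, 0, 0]` transforms to a flat spectrum of ones, and a constant signal `[1, 1, 1, 1]` puts all its energy (4) in the first bin.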

The dynamic is combined with a kind of Kalman filter (this filter estimates the state of a dynamic system from a series of incomplete or noisy measurements, which is typically the case with NNs/Layers/neurons).

This would thus involve a new concept in NN training that I share here for the first time, which is to use a “dynamic of a Layer” that defines a form of evolution of the Layer over time (as with a target in a radar system).
Current NNs update each neuron during training, but not the Layer architecture. This is where I am thinking of using a Kalman filter to obtain the best NN architecture from step t to step t+1, by comparing the estimated loss/gradient descent with the values observed after the update of the NN. This could make it possible to eliminate the effect of noise/bad input data as well as bad gradient descent during the training of the NN (cf. gradient-based learning methods and backpropagation). As learning and retrieval are not two independent operations, the Layer structure can then be calculated for the present, the past, or a future horizon, just as is done in radar tracking software. With this prediction, the output estimate of a state could be made according to a dynamic adjustment of the Layers.
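As a very reduced illustration of the filtering part only, here is a scalar Kalman filter applied to a noisy sequence of loss values. It assumes a random-walk model for the loss, and the noise variances are arbitrary example values; the names `kalman_step` and `filter_losses` are hypothetical, and nothing here decides anything about the Layer architecture.

```python
def kalman_step(estimate, variance, measurement,
                process_var=1e-4, measurement_var=1e-2):
    """One predict/update cycle of a scalar Kalman filter (random-walk model)."""
    # Predict: the state is assumed unchanged, but uncertainty grows.
    prior_var = variance + process_var
    # Update: blend prediction and measurement using the Kalman gain.
    gain = prior_var / (prior_var + measurement_var)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * prior_var
    return new_estimate, new_variance, gain


def filter_losses(losses, initial_variance=1.0):
    """Return the filtered estimate of the loss after each measurement."""
    estimate, variance = losses[0], initial_variance
    smoothed = [estimate]
    for loss in losses[1:]:
        estimate, variance, _ = kalman_step(estimate, variance, loss)
        smoothed.append(estimate)
    return smoothed
```

The gap between the predicted loss and the measured loss at step t+1 (the innovation, `measurement - estimate`) is the kind of quantity that could then drive an architecture decision.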

To give you a representative image: it’s as if you have a window that you narrow when your target is well detected at a position, and that you enlarge if you have lost it, in order to find it and lock onto it again. This can be very practical in some games to reduce calculations.

It is a little like this with an NN that holds on well with a low loss, versus one that starts to deviate from good predictions. The underlying idea is that shrinking/enlarging the size of the Layers is comparable to changing the size of the window in radar tracking. The ideal is thus to use Layers that are as small as possible, in order to limit CPU usage during training and then, less importantly, when the trained NN is queried…
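The window analogy can be sketched as a simple rule that resizes a Layer from the current loss. This is only a toy reading of the idea: the thresholds, step and size limits are made-up values, and `adjust_layer_size` is a hypothetical helper, not part of ATRAP.

```python
def adjust_layer_size(size, loss, low=0.05, high=0.5,
                      step=4, min_size=4, max_size=256):
    """Shrink the layer when the loss is low, enlarge it when the loss is high.

    Mirrors a radar tracking window: narrow it while the target is well
    tracked, widen it again when the target is being lost.
    """
    if loss < low:
        size -= step   # tracking well: try a smaller, cheaper layer
    elif loss > high:
        size += step   # losing the target: give the layer more capacity
    return max(min_size, min(max_size, size))
```

Between the two thresholds the size is left unchanged, which avoids resizing the Layer at every step.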

2019-06-11T22:00:00Z Some news: using the proposed OOP approach in Armory seems very promising for the readability and simplification of large applications (see Communication between objects for the first steps using the proposed OOP approach).

Secondly, the NN is now designed to be dynamic, and I think I will be able to test very interesting things with this new capability, which I haven’t yet found elsewhere in scientific publications … a little as if one were moving from an NN with a frozen architecture to an NN capable of metamorphosing according to events.

Didier · November 3, 2020, 11:58am

As you can see in this long post, observing and efficiently describing the behavior of a 3D robot, which could then be generalized to every actor moving in a 3D space, is done here with a neural network, and more precisely with the now well-known DRL (Deep Reinforcement Learning) techniques.

Thus, just as ArmorPaint is a great stand-alone software designed for physically-based texture painting, I proposed a year ago in another post on this forum to start thinking about whether it would also be great for the Armory community to have a kind of “Armoranimation” software designed to let us estimate the pose of a 3D actor, and so efficiently quantify the behavior of autonomous pre-trained actors, for game development for example.

As a first approach, considering the first results obtained with the ATRAP benchmark, DRL could be at the heart of this tool and allow us to train actors (as in ATRAP with the training of the industrial robot in a 3D environment) thanks to the Armory engine environment.

Thus, we could envisage that by offering some specialized logic nodes inside an “Armoranimation” tool, we could facilitate:

the capture of 3D images/poses, with very good accuracy, as already observed in the Atrap tests,
the automatic labelling of those 3D captures,
the training of the NN (neural network), for example with DRL techniques, using specialized logic nodes,
the export of the trained NN to meshed objects inside Armory.

Efficiently designing the behaviours of 3D actors is a core tenet of modern 3D games, so the main goal here is to obtain a new tool, “Armoranimation”, that greatly accelerates and facilitates 3D animation and actor behaviours, as needed for making games but also for other domains like engineering and robotics.

Feel free to share your point of view here on this potential tool and how you would see it.

This link, Gym (openai.com), is worth a visit too …

then gym/environments.md at master · openai/gym · GitHub, to get a better idea of what I mean by the “Armoranimation” tool … equipped with a zoo of trained actors: https://github.com/araffin/rl-baselines-zoo

A remark too: when dealing with the real world, we should have the possibility of collecting large quantities of data from sensors and actuators, to be used for improving the simulation environments in this kind of tool. (In CAD, it’s called hardware in the loop.)
In the same spirit, some noise could also be made available and added to the inputs for each state, in order to blur the running environments and get a simulated environment that is closer to the real world.
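A minimal sketch of the noise idea: perturb each state before it is fed to the agent. Zero-mean Gaussian noise and the `sigma` value are assumptions; real sensor noise collected from hardware could replace them.

```python
import random


def noisy_observation(state, sigma=0.01, rng=random):
    """Return a copy of `state` with zero-mean Gaussian noise added per value."""
    return [value + rng.gauss(0.0, sigma) for value in state]
```

For reproducible experiments, pass a seeded generator, e.g. `noisy_observation(state, rng=random.Random(0))`.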

Didier · May 7, 2021, 9:31am

News, May 2021: the current ATRAP tool spec is summarized in the lines below.

Keep in mind that this is “Armory with A.I. inside”, used for example for 3D game creation, simulation, robot training, education, …:

Scene Composition: the capability to create, select and place 3D objects in a 3D Armory scene.
Scenarios: the capability to create/select behaviours and animations of actors, taking into account ad hoc localisation and goal parameters, thanks to the AI automatically selecting/connecting logic nodes and creating the associated RL parameter files.

So currently, I’m evaluating the main frameworks and dev tools to benchmark and test DRL results in training and test accuracy, but without 3D as a first step.

This is done with a view to addressing the first two main potential problems that I think we could encounter during development, and that come from the experience acquired with ATRAP (think of a tool that is necessary to be sure of being able to do this):

First, we will have an unknown environment, so we will need a tool to facilitate/learn how this environment works.

Second, each Agent placed in a 3D scene must be able to improve its policy if we want to get something realistic in the end.

In ATRAP, the Reinforcement Learning technique implemented inside the logic nodes works as trial-and-error learning, and the Robot Agent was placed inside a clear and simplified State and Environment space. For example, what happens in these simple cases:

if the Q-function that captures the expected total future reward is initially unknown, as is the function that tells the best action to take,
if the action space is very large and not discrete,
if we have a Markov process,
if the environment space is infinite,
if Agent actions are taken in quasi-real-time?
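For the first case in the list, the classic answer in a small discrete setting is tabular Q-learning: start from an unknown (all-zero) Q-function and improve it by trial and error. The sketch below is the textbook update rule, not the ATRAP logic nodes; `alpha`, `gamma` and the state/action names are example values, and it does not cover the large, continuous or infinite spaces raised by the other cases.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning update: Q <- Q + alpha * (target - Q)."""
    # Best value currently known for the next state (0.0 if never visited).
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Best known action in `state` under the current Q estimates."""
    return max(actions, key=lambda a: q[(state, a)])
```

A `defaultdict(float)` works as the initially unknown Q-table: every unseen (state, action) pair starts at 0.0 and is refined as rewards arrive.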

Thus the “AI inside” should be able, during the creation steps, to organize/improve the Deep Reinforcement Learning algorithms during training (in short, to refine the choice within a library of available logic nodes/parameters that focus on improving Value learning and Policy learning inside each Agent).

The next step is to test all of this with ray tracing, for example on an RTX GPU, to obtain results that are as realistic as possible.

That’s why I think that one of the first tool components could be inspired by the ArmorPaint tool, as a useful tool for training/initializing Agent algorithms in simplified scenes (thus letting the AI make a first choice of logic nodes and parameters, for example the discount value) before throwing them into the game.
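To show what the discount value controls, here is a small sketch of the discounted return, the quantity an RL agent tries to maximize. The function name and the example `gamma` values are assumptions for illustration.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of rewards where the reward at step t is weighted by gamma**t."""
    total = 0.0
    # Accumulate backwards: G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total
```

With `gamma=0.5`, a reward of 1.0 received two steps in the future is only worth 0.25 now, so a low discount value makes the Agent short-sighted while a value close to 1 makes it plan further ahead.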

It could also be a way to test/validate ideas for Agents and logic nodes, implemented inside a tool with a simplified 3D scene as a first step.