

Main Aspects of Robotics

June 6, 2025 by Abdur Rosyid

The main aspects of robotics can be summarized as follows:

1. Topology

A robot can take many existing body types and topologies, and new ones can be devised in the future. The existing ones include: 1) wheeled unmanned ground vehicles (UGVs), 2) tracked UGVs, 3) biped, quadruped, and six-legged robots, 4) humanoids, 5) multi-rotor unmanned aerial vehicles (UAVs), 6) flapping-wing aerial robots, 7) surface water vehicles, 8) underwater vehicles, 9) fish-like underwater robots, 10) snake-like robots, 11) soft robots, such as an octopus robot, 12) serial manipulators, 13) tree-like manipulators, such as a robotic hand consisting of several fingers, 14) parallel manipulators, 15) hybrid serial-parallel manipulators, 16) cable-driven robots, and 17) reconfigurable/transformable robots. This is by no means a final list, since other body types and topologies can be designed and built in the future.

For wheeled robots, the wheels can be standard wheels, omnidirectional wheels, or Mecanum (Swedish) wheels. The number of wheels also varies, typically three, four, or six. For multi-rotor UAVs, the number of rotors is typically four (quadrotor) or six (hexarotor). For manipulators, while the topologies of serial and tree-like manipulators are rather limited, there are a great many possible topologies of parallel and hybrid serial-parallel robots, and only a few of them have reached the market.

The choice of robot body type and topology is mainly driven by the requirements, such as function (what the robot is used for), mobility, the operating environment (even terrain, uneven terrain, water surface, underwater, air, etc.), speed, the space the robot is allowed to occupy, and so on.

2. Kinematics

Once a specific robot body topology is given, the kinematics of the robot should be formulated. There are two levels of kinematics to be derived: 1) pose kinematics and 2) differential kinematics. The pose kinematics includes: 1) inverse kinematics and 2) forward kinematics. From the differential kinematics, the Jacobian of the robot is obtained. A commonly used kinematics formulation method, particularly for serial manipulators, is based on the Denavit-Hartenberg (DH) convention. The pose of a mobile robot or a manipulator's end-effector in the task space, in general, consists of a Cartesian position (x, y, z) and an orientation. Several orientation representations can be used; the most common are Euler angles, the angle-axis representation (less commonly used), and quaternions. An Euler angles representation must be defined with a particular rotation sequence. Since Euler angles can easily run into a representation singularity (gimbal lock), quaternions are frequently used as an alternative.
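As an illustration (not part of the original post), below is a minimal Python sketch of forward kinematics for a serial manipulator using the standard DH convention; the DH table and joint angles describe a hypothetical planar 2R arm.

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform of one link using the standard DH convention."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(dh_table, joint_angles):
    """Multiply the link transforms to obtain the end-effector pose."""
    T = np.eye(4)
    for (d, a, alpha), theta in zip(dh_table, joint_angles):
        T = T @ dh_transform(theta, d, a, alpha)
    return T

# Hypothetical planar 2R arm: link lengths 0.5 m and 0.3 m.
dh_table = [(0.0, 0.5, 0.0), (0.0, 0.3, 0.0)]   # (d, a, alpha) per joint
T = forward_kinematics(dh_table, [np.pi / 4, np.pi / 6])
print(T[:3, 3])   # end-effector position
```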

For a robot with rigid links, the kinematics assumes that the links do not undergo elastic deformation. On the other hand, for a robot with flexible or soft links, the kinematics takes the elastic deformation of the links into account. In the case of a robot with flexible links, the elastic deformation can be either small or large. In the case of a robot with soft links, i.e. a soft robot, the elastic deformation is typically large.

3. Dynamics

Similar to kinematics, dynamics can be divided into: 1) rigid-body dynamics, 2) dynamics of a robot with small elastic displacements, and 3) dynamics of a robot with large elastic displacements. Which type of dynamics should be used depends on the behavior of the robot body. The dynamics problem is commonly classified into: 1) forward dynamics and 2) inverse dynamics. The forward dynamics problem is defined as: “Given the actuator forces/torques, find the motion (pose, velocity, and acceleration) of the robot”. The inverse dynamics problem is defined as: “Given the motion (pose, velocity, and acceleration) of the robot, find the required actuator forces/torques”. From these definitions, it is easy to see that forward dynamics is typically used to simulate a robot, whereas inverse dynamics is used for control purposes, such as sizing the actuators and developing a model-based control scheme. Since the dynamics equations are second-order differential equations, the forward dynamics is typically solved by transforming the second-order differential equations into a first-order system (commonly called the state-space representation) and subsequently integrating the first-order system numerically to obtain the position and velocity of the robot. Afterwards, the acceleration can be obtained easily. The inverse dynamics is straightforward; it is simply a matter of arranging the equations of motion such that the actuator force/torque vector is on the left-hand side.
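To make the forward-dynamics procedure concrete, here is a minimal sketch (an illustration, not from the original post) that integrates the forward dynamics of a single pendulum by rewriting the second-order equation of motion as a first-order state-space system; the pendulum parameters and applied torque profile are hypothetical.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical single pendulum: m*l^2 * qdd + b*qd + m*g*l*sin(q) = tau
m, l, b, g = 1.0, 0.5, 0.05, 9.81

def applied_torque(t):
    return 0.2 * np.sin(t)           # hypothetical actuator torque profile

def state_derivative(t, x):
    q, qd = x                        # state = [position, velocity]
    qdd = (applied_torque(t) - b * qd - m * g * l * np.sin(q)) / (m * l**2)
    return [qd, qdd]

# Numerically integrate the first-order system to obtain position and velocity.
sol = solve_ivp(state_derivative, (0.0, 5.0), [0.1, 0.0], max_step=0.01)
q, qd = sol.y                        # acceleration can then be recovered easily
```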

For a robot with flexible bodies, there are also other types of dynamic analysis, such as modal analysis, time response analysis, and frequency response analysis. Time response analysis is essentially a forward dynamics analysis, but in the case of a robot with flexible bodies, the flexibility of the robot bodies is taken into account. The flexibility in this case may come from the links, the joints, and/or the actuators of the robot.

The dynamics equations, commonly known as the equations of motion, can be derived using various methods, such as the Newton-Euler method, the Lagrangian method, the principle of virtual work, Kane's method, the Gibbs-Appell method, etc. In general, the equations of motion can be solved, either for forward dynamics or inverse dynamics, using: 1) a non-recursive method or 2) a recursive method.

4. Detail design and fabrication

After the topology and kinematics of a robot are given, unless it is an off-the-shelf robot, one should make a detailed design of the robot and subsequently fabricate it. The detailed design should consider several aspects, including the strength of the robot bodies/links, selection of materials, selection of off-the-shelf components, selection of actuators, optimization of the control effort, and optimization of the dynamic performance of the robot. A trade-off should be made between contradictory criteria such as strength versus control effort and dynamic performance. In this case, the robot bodies/links should be made as light as possible yet strong enough to withstand the expected payload. This typically needs an optimal selection of materials and topology optimization of the robot bodies/links. CAD and CAE software is typically used to perform this task.

5. Low-level motion control

The low-level motion control is the real-time, closed-loop control of the actuators using internal sensors such as encoders or resolvers. The simplest approach uses a PID controller. The hardware of the low-level motion control typically consists of a microcontroller or motion controller, motor drives, motors, and encoders/resolvers. To ensure real-time performance, the PID control software is typically deployed to the microcontroller or motion controller. To improve the performance of the motion control, particularly in demanding systems such as those with varying load, nonlinearity, coupling between axes, and/or high-dynamics effects, some advanced control schemes can be used, such as Linear Quadratic Regulator (LQR) control, Linear Quadratic Gaussian (LQG) control, Fuzzy Inference System (FIS) control, Adaptive Neuro-Fuzzy Inference System (ANFIS) control, adaptive control, robust control, computed torque feedforward control, inverse dynamics feedback linearization control, passivity-based control, Model Predictive Control (MPC), and Sliding Mode Control (SMC).
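As a minimal sketch (an illustration, not from the original post), the discrete PID law below could run in the real-time loop of a microcontroller or motion controller; the gains, the sample time, and the read_encoder()/write_motor() hardware interface functions are hypothetical placeholders.

```python
class PID:
    """Discrete PID controller with a fixed sample time."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Hypothetical 1 kHz joint position loop; read_encoder() and write_motor()
# stand in for the real hardware interface.
pid = PID(kp=20.0, ki=5.0, kd=0.5, dt=0.001)
# while True:
#     command = pid.update(setpoint=1.0, measurement=read_encoder())
#     write_motor(command)
```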

Since PID control is linear, the nonlinearity of the robot dynamics is typically treated as a disturbance when PID control is used. The simplest implementation of PID control is the decentralized (independent joint) control scheme, in which each axis (actuator) is controlled independently. Another possible, simple implementation is the master-and-slave scheme, in which one axis serves as the master and the other axes serve as slaves. LQR and LQG are also linear control methods, since the system dynamics is represented as a linear system; a robot with nonlinear dynamics can be controlled with these schemes by linearizing the robot's dynamics about an equilibrium point. Fuzzy control is typically used to provide gradual (continuous) regulation based on certain membership functions (triangular, trapezoidal, etc.). In an adaptive control scheme, the PID gains are typically adjusted through an adaptation technique in order to adapt to changing robot parameters and/or loads. A robust control scheme aims at suppressing the effect of disturbances so that the control performance is not degraded by them. Computed torque feedforward control and inverse dynamics feedback linearization control are both model-based control schemes which require the inverse dynamics of the robot in order to compensate for the nonlinearity of the robot's dynamics, so that the PID control only has to handle the linear part of the robot's dynamics. Since model-based control typically requires intensive computation of the inverse dynamics, appropriate hardware is required to perform the computation at low latency. MPC and SMC are other control methods used to deal with the nonlinear dynamics of a robot.

Two types of control commands are typically used in robotics: 1) position commands and 2) velocity commands. A position command specifies a position set-point to be reached. A velocity command asks the robot to move at the commanded velocity: zero speed means no motion, positive speed means motion in the positive direction, and negative speed means motion in the negative direction. The magnitude of a non-zero commanded speed is the speed set-point. The velocity command is typically used to jog a robot (manipulator or mobile robot).

Finally, the motion control can be implemented either: 1) in the joint space or 2) in the task space. The former means the measurement (sensing) is performed in the joint space, whereas the latter means the measurement is performed in the task space. Although the latter is ideal, its implementation is usually impossible or impractical. For this reason, the motion control in the joint space is more common.

6. Low-level force control

The existing force control schemes can be broadly classified into two categories: 1) passive interaction (compliance) control and 2) active interaction (compliance) control. Passive interaction control only utilizes the compliance of a certain part of the robot's body in order to regulate the robot's interaction with its environment. In this case, either compliant joint(s) or compliant link(s) can be used. A common practice with industrial robots is to use a compliant end-effector to perform a peg-in-hole task. Due to the compliance of the end-effector, it can adapt to the environment to accomplish the peg-in-hole task without changing the commanded trajectory of the end-effector at execution time. Passive interaction control does not require a force/torque (F/T) sensor and hence is cheap. It also provides fast response. However, it does not guarantee that the contact force (between the robot and the environment) never becomes unexpectedly large, because the contact force is not explicitly controlled. In addition, a specific compliant part that provides the compliance in the interaction may need to be designed and built for a specific task.

Active interaction control typically requires an F/T sensor to feed the wrench measurement back to the controller, and hence it is expensive. It also responds more slowly due to the cycle time of the feedback loop. To enable reasonably fast response, active interaction control should be combined with some degree of passive compliance.

Active interaction control can be divided into two classes, namely: 1) indirect force control and 2) direct force control. The indirect force control is called so because it does not directly control the force through a force feedback loop; instead, it controls the force through motion control. There are two schemes of indirect force control: 1) impedance control and 2) admittance control. Neither impedance nor admittance control strictly requires a wrench measurement; however, without wrench measurement the resulting control is nonlinear and coupled, whereas with wrench measurement the impedance and admittance control become linear and decoupled. In impedance control, the control system reacts to a motion deviation by generating control forces. In admittance control, the control system reacts to the interaction forces by imposing a motion deviation.
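As a minimal single-axis sketch (an illustration, not from the original post), the admittance law below maps a measured contact force into a motion deviation through a virtual mass-spring-damper; the virtual parameters and the read_force() interface are hypothetical.

```python
class Admittance1D:
    """Virtual mass-spring-damper: M*xdd + D*xd + K*x = f_measured."""
    def __init__(self, M, D, K, dt):
        self.M, self.D, self.K, self.dt = M, D, K, dt
        self.x = 0.0    # position deviation from the nominal trajectory
        self.xd = 0.0   # velocity of the deviation

    def update(self, f_measured):
        xdd = (f_measured - self.D * self.xd - self.K * self.x) / self.M
        self.xd += xdd * self.dt
        self.x += self.xd * self.dt
        return self.x

# Hypothetical use inside a 500 Hz loop: add the deviation to the nominal motion.
adm = Admittance1D(M=2.0, D=40.0, K=200.0, dt=0.002)
# x_cmd = x_nominal + adm.update(read_force())   # send x_cmd to the motion controller
```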

Direct force control, as implied by the name, directly controls the interaction forces through a force feedback loop. An explicit model of the interaction between the robot and the environment is required. In this case, one needs to specify the desired motion and the desired wrench in a way that is consistent with the constraints imposed by the environment. A common type of direct force control is hybrid motion-force control, which aims at controlling the motion in the unconstrained directions and controlling the wrench in the constrained directions.

7. Perception and high-level control

In order to perceive the environment and act accordingly, the robot needs one or more exteroceptive sensors and a high-level control layer based on the information acquired by those sensors. Various exteroceptive sensors can be used to sense the temperature, distance, and appearance of surrounding objects, etc. The temperature of surrounding objects can be measured using a thermal camera. The distance of surrounding objects can be measured using an ultrasonic sensor or a LiDAR. An ultrasonic sensor simply gives the distance of objects based on the reflection of the ultrasonic wave emitted by the sensor. A LiDAR, either 2D or 3D, provides a pointcloud of the objects. The appearance of surrounding objects can be captured using a camera, which gives RGB images of the objects. Certain types of cameras, such as the Kinect and other RGB-D cameras, are able to capture both the appearance and the distance of the surrounding objects. In this case, the appearance of the surrounding objects is given as RGB images, whereas the depth information acquired by the camera is given as depth images, which can then be converted to a pointcloud.
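As an example of that last step, here is a minimal sketch (an illustration, not from the original post) that converts a depth image into a pointcloud using the pinhole camera model; the intrinsic parameters and the random depth image are hypothetical.

```python
import numpy as np

def depth_to_pointcloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (in meters) into an N x 3 pointcloud
    using the pinhole camera model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]          # drop invalid (zero-depth) pixels

# Hypothetical intrinsics for a 640 x 480 depth camera.
depth = np.random.uniform(0.5, 3.0, size=(480, 640))
cloud = depth_to_pointcloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```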

The raw data acquired from the environment, either pointcloud or RGB images, is typically processed before it is used by an algorithm to extract meaningful information. A pointcloud can be manipulated using tools such as the Point Cloud Library (PCL), whereas RGB images can be processed and enhanced through image processing techniques. Detection of objects can be performed by applying detection algorithms to the pointcloud or the images. These detection algorithms can be broadly classified into two categories: 1) classic detection algorithms and 2) algorithms based on deep neural networks (commonly called deep learning algorithms). Processing pointclouds typically demands significant CPU resources, whereas processing images typically requires a GPU. Once the objects of interest are detected, the robot needs to take an action such as avoiding detected obstacles, approaching a detected object, picking/grabbing a detected object, etc. The process of taking actions based on data acquired from the environment by exteroceptive sensors, followed by running a detection algorithm to infer meaningful information from the data, is called high-level control. It is called so because such control is a higher level of control built on top of the low-level control of the robot's actuators. The high-level control is typically performed at a lower frequency due to the demanding processing, whereas the low-level control is typically performed at a higher frequency.

Vision-based control, commonly called visual servoing, can be classified into three categories: 1) image-based (2D) visual servoing, 2) position-based (3D) visual servoing, and 3) hybrid (2.5D) visual servoing. The camera setup in vision-based control can be either: 1) eye-in-hand or 2) eye-to-hand. In the eye-in-hand setup, the camera moves with the robot/manipulator. In the eye-to-hand setup, the camera is fixed at an inertial frame in the environment, looking at the motion of the robot and the moving object(s) in the environment.

Controlling a robot by detecting the surrounding objects leads to autonomous or semi-autonomous operation of robots. Since the detection of objects using pointclouds requires an expensive LiDAR and substantial computing power, the development of detection algorithms based on RGB images, particularly using deep learning, has recently gained considerable interest because RGB cameras are cheap.

8. Calibration and system identification

The actual kinematic parameters of a robot are often different from their design values. This is typically due to errors in fabrication, assembly, clearance, etc. The actual kinematic parameters are typically estimated using methods such as linear (small) perturbation of the kinematic parameters and linear/nonlinear least squares. Once the estimates of the actual kinematic parameters are obtained, they are applied to the kinematic model of the robot to compensate for the kinematic errors. This is called kinematic calibration, or simply calibration. Calibration techniques can be classified into two broad categories: 1) internal calibration and 2) external calibration. Internal calibration uses internal sensors to make the measurements; in this case, calibration artifacts are commonly used. External calibration uses an external sensor to make the measurements. Calibration can also be divided into: 1) offline calibration and 2) online calibration. Offline calibration is performed when the robot is not in operation, whereas online calibration is performed while the robot is in operation.

In a robot with vision-based control, it is common to calibrate the camera. This may include two types of calibration: 1) calibration of the intrinsic parameters of the camera and 2) calibration of the extrinsic parameters. A very common way to calibrate a camera is by using a chessboard (a board with a checker pattern) of known dimensions.
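As a minimal sketch (an illustration, not from the original post) of intrinsic calibration with a chessboard, the snippet below uses OpenCV; the board size, square size, and image folder are hypothetical.

```python
import glob
import cv2
import numpy as np

board_size = (9, 6)        # inner corners of the hypothetical chessboard
square_size = 0.025        # hypothetical 25 mm squares

# 3D corner coordinates of the board in its own plane (z = 0).
objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_size

obj_points, img_points = [], []
for path in glob.glob("calib_images/*.png"):      # hypothetical image folder
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, board_size, None)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Intrinsic matrix and distortion coefficients from the detected corners.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print(K)
```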

While the actual kinematic parameters are identified (estimated) in the calibration, the actual dynamic parameters are identified in the so-called system identification. Hence, system identification usually means the identification of the dynamic parameters. These dynamic parameters may include the inertial parameters (masses, first moments of inertia, second moments of inertia), the stiffness parameters, and the damping parameters of the robot's components. Knowledge of the actual dynamic parameters is crucial in model-based control, since the performance of this type of control depends heavily on the accuracy of the dynamic model. Several algorithms can be employed to identify the dynamic parameters. They are broadly classified into two approaches: 1) model-driven identification and 2) data-driven identification. Model-driven identification includes the use of linear/nonlinear least squares or the Kalman Filter / Extended Kalman Filter (EKF) / Unscented Kalman Filter (UKF). Data-driven identification uses abundant data acquired from the robot in learning-based algorithms such as deep learning.
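A common model-driven approach exploits the fact that rigid-body dynamics is linear in the inertial parameters, tau = Y(q, qd, qdd) theta. Below is a minimal least-squares sketch (an illustration, not from the original post); the regressor structure, the synthetic trajectory, and the "true" parameters are hypothetical placeholders standing in for a real robot model and logged measurements.

```python
import numpy as np

def regressor(q, qd, qdd):
    """Hypothetical 1-DOF regressor: tau = I*qdd + b*qd + c*sin(q) = Y @ theta."""
    return np.column_stack([qdd, qd, np.sin(q)])

# Synthetic logged trajectory (stands in for real measurements).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 500)
q, qd, qdd = np.sin(t), np.cos(t), -np.sin(t)
theta_true = np.array([0.12, 0.05, 0.8])            # hypothetical true parameters
tau = regressor(q, qd, qdd) @ theta_true + 0.01 * rng.standard_normal(t.size)

# Least-squares estimate of the dynamic parameters from the logged data.
Y = regressor(q, qd, qdd)
theta_hat, *_ = np.linalg.lstsq(Y, tau, rcond=None)
print(theta_hat)
```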

9. Localization

To be able to control a mobile robot well, whether a UGV, UAV, legged robot, humanoid robot, surface water robot, or underwater robot, it is critical to localize the robot accurately. Localizing a robot means determining its location (pose) with respect to a known frame. A robot can be localized with respect to its odometry frame, the map frame, or a certain world frame. The mapping between the map frame and the world frame is typically known, and the mapping between the odometry frame and the map frame is also usually known.

Localization is basically the estimation of the robot's pose; it answers the question: “Where is the robot now?”. Well-known algorithms for this pose estimation problem include the EKF, the Particle Filter (PF), and Monte Carlo Localization (a particle-filter method), which was later enhanced into Adaptive Monte Carlo Localization.

The most basic method of localization uses the odometry data of the robot. Localization using wheel odometry is based on dead reckoning. However, due to wheel slip and drift, localization based on wheel odometry alone may yield inaccurate estimates of the robot's pose. For this reason, it is very common to fuse the odometry data with other data to obtain more accurate pose estimates. The most common sensor fusion for robot localization is fusing odometry with an Inertial Measurement Unit (IMU). In an outdoor setting, Global Positioning System (GPS) data can also be added to the fusion. The odometry can be wheel odometry in the case of wheeled robots or visual odometry in the case of UAVs. GPS can only be used in an outdoor environment, and the GPS data can be problematic in an environment with tall buildings/structures nearby due to interference. Standard GPS typically has a fairly large error (a few meters). Differential GPS (DGPS) provides a smaller positioning error, and a Real-Time Kinematic (RTK) positioning system also provides a smaller error.
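As a minimal sketch (an illustration, not from the original post) of dead reckoning, the update below integrates wheel encoder increments of a differential-drive robot into a pose estimate; the wheel radius, track width, and encoder readings are hypothetical.

```python
import numpy as np

# Hypothetical differential-drive geometry.
WHEEL_RADIUS = 0.05   # m
TRACK_WIDTH = 0.30    # m (distance between the two wheels)

def odometry_update(pose, d_phi_left, d_phi_right):
    """Dead-reckoning pose update from wheel rotation increments (rad)."""
    x, y, theta = pose
    d_left = WHEEL_RADIUS * d_phi_left
    d_right = WHEEL_RADIUS * d_phi_right
    d_center = 0.5 * (d_left + d_right)          # distance traveled by the center
    d_theta = (d_right - d_left) / TRACK_WIDTH   # change in heading
    x += d_center * np.cos(theta + 0.5 * d_theta)
    y += d_center * np.sin(theta + 0.5 * d_theta)
    return (x, y, theta + d_theta)

pose = (0.0, 0.0, 0.0)
for _ in range(100):                             # hypothetical encoder increments
    pose = odometry_update(pose, d_phi_left=0.02, d_phi_right=0.025)
print(pose)
```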

10. SLAM

For a mobile robot, both the map and the robot localization are critical. In some circumstances, a map of the environment in which the robot operates is already available. If the environment is known in advance, a map can easily be generated manually using a graphical/drawing tool. Such a map is a binary map, containing only black and white areas: obstacles such as walls and furniture are indicated in black, whereas empty space is indicated in white.

If the environment is unknown in advance, either because it has never been visited before or because it keeps changing, then simultaneous localization and mapping (SLAM) needs to be performed. A SLAM algorithm generates a map while the robot explores the environment and localizes the robot at the same time. This is a chicken-and-egg problem, and for this reason it is solved using an estimation algorithm such as the EKF, UKF, or PF. The best-known SLAM method using the PF is FastSLAM.

SLAM typically relies on loop closure, i.e. the robot moves through the environment and returns to its initial pose. Loop closure is critical to the accuracy of SLAM, since the estimated pose of the robot inherently becomes less and less accurate (more and more uncertain) as the robot moves farther and farther from its initial position. When the robot closes the loop, these uncertainties collapse.

SLAM can be either 2D or 3D. 2D SLAM utilizes a 2D sensor such as a 2D LiDAR, whereas 3D SLAM requires a 3D sensor such as a 3D LiDAR or a camera. When a camera is used, the SLAM is commonly called visual SLAM, which uses the images acquired by an onboard camera. Various SLAM packages have been developed; some commonly used ones are GMapping, Cartographer, and RTAB-Map.

11. Motion planning and navigation

For a manipulator, motion planning can be classified into two problems: 1) motion planning without obstacle avoidance and 2) motion planning with obstacle avoidance. For a mobile robot such as a UGV, biped robot, quadruped robot, UAV, or underwater robot, the motion planning problem typically always considers obstacle avoidance. Navigation is defined as moving a mobile robot from its initial pose to its goal pose along an optimal path without colliding with obstacles in the environment.

The motion planning of a manipulator without obstacle avoidance can be performed either in the joint space or in the task space. There are two problem cases: 1) point-to-point interpolation and 2) interpolation through multiple points. The latter case is required to plan the motion of the actuators so that the resulting task-space motion follows multiple segments without stopping at the transition points between adjacent segments. Depending on the need, this can be achieved by passing through the transition points or by blending around them. The goal of interpolation is to obtain the trajectory, which consists of the path and its corresponding timing. The interpolation is performed using a certain motion profile, such as a linear motion profile with quadratic blends (commonly known as the trapezoidal motion profile), an S-curve motion profile, a linear motion profile with quintic blends, etc. Interpolation in the joint space is simpler, but it may result in an unexpected path in the task space unless there is a one-to-one mapping between the actuators' velocity and the end-effector's velocity, and it does not guarantee obstacle avoidance in the task space. For these reasons, it is more practical to perform interpolation in the task space based on the expected trajectory of the end-effector and subsequently map the interpolated task-space trajectory to the joint space for actuation. Motion planning in the task space can easily avoid obstacles in the task space.
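As a minimal sketch (an illustration, not from the original post) of point-to-point interpolation with a trapezoidal motion profile, the function below returns position along a rest-to-rest move for a single joint or axis; the velocity and acceleration limits are hypothetical, and the triangular case (no cruise phase) is handled as well.

```python
import numpy as np

def trapezoidal_profile(distance, v_max, a_max, t):
    """Position along a rest-to-rest trapezoidal (or triangular) profile."""
    t_acc = v_max / a_max
    d_acc = 0.5 * a_max * t_acc**2
    if 2 * d_acc > distance:                       # no cruise phase: triangular profile
        t_acc = np.sqrt(distance / a_max)
        v_max = a_max * t_acc
        d_acc = 0.5 * distance
    t_cruise = (distance - 2 * d_acc) / v_max
    t_total = 2 * t_acc + t_cruise
    t = np.clip(t, 0.0, t_total)
    if t < t_acc:                                  # acceleration phase
        return 0.5 * a_max * t**2
    if t < t_acc + t_cruise:                       # constant-velocity phase
        return d_acc + v_max * (t - t_acc)
    td = t_total - t                               # deceleration phase (by symmetry)
    return distance - 0.5 * a_max * td**2

# Hypothetical joint move of 1.2 rad with v_max = 1 rad/s, a_max = 2 rad/s^2.
for t in np.linspace(0.0, 2.0, 5):
    print(t, trapezoidal_profile(1.2, 1.0, 2.0, t))
```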

The motion planning of a manipulator or a mobile robot with obstacle avoidance is typically divided into: 1) global motion planning and 2) local motion planning. Global motion planning is performed using a certain motion planning algorithm. A famous classical method is based on the potential field. This can be pictured as building a topographic map of the environment based on the potential field, in which obstacles are represented by protrusions whereas unoccupied areas are represented by a flat surface. The goal is represented by the lowest point in the map, which corresponds to the lowest potential. The motion planning generates a path descending from higher potential to lower potential, and finally to the lowest potential (i.e. the goal).

Another common approach to motion planning uses an occupancy map, which can be 2D or 3D. A 3D occupancy map, consisting of cubic voxels, is commonly called an octomap. The size of the grid cells/voxels indicates the resolution of the map, which in turn affects the accuracy of the motion planning. The occupancy map is at least binary, consisting of occupied and unoccupied cells/voxels. This map can be made manually if the environment is known in advance, or generated automatically using a sensor such as a LiDAR or an RGB-D camera. A motion planning algorithm aims at determining a path from the initial pose of the manipulator or mobile robot to a goal pose without colliding with obstacles (occupied cells/voxels). An occupancy map can be transformed into a more sophisticated map called a costmap. The cells/voxels in the costmap take values from 0 to 255, where 0 indicates unoccupied (free) area and 255 indicates occupied area. Instead of being binary, the costmap introduces gray areas in which collision is probable. Some existing motion planning algorithms based on the occupancy map include the depth-first algorithm, the breadth-first algorithm, heuristic search, Dijkstra's algorithm, and the A* algorithm. There are also planners based on Reinforcement Learning (RL) and related methods, such as dynamic programming and more advanced RL algorithms.
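As a minimal sketch (an illustration, not from the original post) of grid-based planning, here is an A* search over a small 2D binary occupancy grid; the grid, start, and goal are hypothetical, and the heuristic is the Manhattan distance.

```python
import heapq
import numpy as np

def a_star(grid, start, goal):
    """A* on a binary occupancy grid (0 = free, 1 = occupied), 4-connected."""
    def h(cell):                                   # Manhattan-distance heuristic
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_set = [(h(start), 0, start, None)]        # (f, g, cell, parent)
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, cell, parent = heapq.heappop(open_set)
        if cell in came_from:
            continue                               # already expanded with a better cost
        came_from[cell] = parent
        if cell == goal:                           # reconstruct the path
            path = [cell]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < grid.shape[0] and 0 <= nc < grid.shape[1] and grid[nr, nc] == 0:
                if g + 1 < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = g + 1
                    heapq.heappush(open_set, (g + 1 + h((nr, nc)), g + 1, (nr, nc), cell))
    return None                                    # no collision-free path found

# Hypothetical 5 x 5 map with a wall in the middle.
grid = np.zeros((5, 5), dtype=int)
grid[2, 1:4] = 1
print(a_star(grid, (0, 0), (4, 4)))
```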

The local motion planner operates on the local costmap around the robot. This local costmap, which moves with the robot, is the basis of the motion planning and obstacle avoidance in the limited area around the robot. The Dynamic Window Approach (DWA) is a famous local motion planner. This algorithm searches over a space of translational and angular velocities. The admissible velocities are those at which the robot can still stop safely before colliding with obstacles. The locality is imposed by the dynamic window, which restricts the admissible velocities to those reachable within a short time interval given the robot's limited acceleration. The locally planned path is determined by optimizing the heading, clearance, and forward velocity of the robot.

The global motion planner is typically computationally expensive and therefore can be slow. It is not practical to use alone when the environment changes with time, because the global planner would need to be rerun repeatedly to accommodate the changes. The local motion planner is typically computationally cheaper and hence faster, but it only plans locally.

Finally, motion planning can be classified into two problems: 1) navigation with avoidance of static obstacles and 2) navigation with avoidance of dynamic obstacles. In the former, the environment, including the obstacles, does not change with time; in the latter, it does. A typical example of the latter is motion planning for an autonomous (self-driving) car in traffic with other moving vehicles. A common way to handle dynamic or sudden obstacles is to apply the global and local planners simultaneously.

12. Mission Planner and Finite State Machine

A mission typically involves a set of tasks that cannot be arranged as a simple sequence. In most cases, which task is executed depends on certain conditions. For example, task B should be executed if task A has been accomplished; if the robot fails to accomplish task A, it should execute task C instead of task B; and so on. If the scenario is quite simple, an if-then-else structure can be used. For a more complex scenario, it is more convenient to develop a finite state machine (FSM) that the robot should follow.
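As a minimal sketch (an illustration, not from the original post), the dictionary-based FSM below encodes the example above; the state names, events, and transitions are hypothetical.

```python
# Transition table: (current_state, event) -> next_state.
TRANSITIONS = {
    ("TASK_A", "done"):   "TASK_B",
    ("TASK_A", "failed"): "TASK_C",
    ("TASK_B", "done"):   "MISSION_COMPLETE",
    ("TASK_C", "done"):   "MISSION_COMPLETE",
}

def run_mission(events):
    """Step the FSM through a sequence of events, starting at TASK_A."""
    state = "TASK_A"
    for event in events:
        state = TRANSITIONS.get((state, event), state)   # ignore undefined events
        print(f"event={event!r} -> state={state}")
    return state

# Hypothetical run in which task A fails and the robot falls back to task C.
run_mission(["failed", "done"])
```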

13. Communication

There are several types of communication involved in robotics: 1) hardware-to-hardware communication within a robot, 2) node-to-node communication, and 3) robot-to-robot communication. The hardware-to-hardware communication can be between the onboard PC and the embedded real-time controller, between the embedded real-time controller and the motor drives, between the embedded real-time controller and the internal sensors, between the onboard PC and the exteroceptive sensors, between the onboard PC and a remote controller, etc. The node-to-node communication is communication between one software node and another. The robot-to-robot communication is required when multiple robots must coordinate with each other.
