Introduction

The prevalence of gastrointestinal diseases, which manifest as a broad range of complex symptoms such as diarrhea, gastrointestinal bleeding, malabsorption, malnutrition, and even neurologic dysfunction, is a major health challenge and a socioeconomic burden for present-day society1. These diseases are characterized by remarkable regional, age, and gender differences2,3. Alarmingly, gastrointestinal cancers account for approximately one-third of worldwide cancer morbidity and mortality4,5. Early screening for gastrointestinal diseases, which allows timely intervention, is of vital importance in reducing mortality and improving life expectancy.

The conventional screening method for gastrointestinal disease is endoscopy6, where clinicians use an elongated flexible endoscope with an embedded camera to inspect the gastrointestinal tract for diagnosis of mucosal diseases. This minimally invasive method is already used extensively in hospitals: since the inspection is achieved via a natural orifice, no incision is required and the patient can often be discharged on the day of the procedure. Although the endoscopic procedure is aligned with clinical practice, vision-based inspection is often challenging, given the limitations of the optical sensors, the poor lighting conditions, and the extremely confined working environment in the gastrointestinal tract. Darkness, glare, reflection, bleeding from biopsy, blur, and defocus can all impair the imaging quality and even lead to missed diagnoses7. Wireless capsule endoscopy has attracted much interest due to its unintrusive image-capturing ability. The miniaturized camera capsule is swallowed by the patient, passes through the digestive tract naturally, captures images along the tract, and transmits them wirelessly to a clinician’s console for diagnosis. This promising method has been widely adopted as a screening tool due to its ease of use, non-intrusive nature, and relatively low cost. However, conventional endoscopy is still required to perform detailed diagnosis after capsule endoscopy, since the capsule is also greatly hindered by its low-quality miniaturized camera, the lighting conditions, and the complex and highly dynamic luminal environment in the gastrointestinal tract. Some emerging endoscopy technologies, such as chromoendoscopy (CE)8, can enhance the histopathological diagnosis of tissues, but the need for a biocompatible dye agent hinders their widespread use.
Computerized virtual chromoendoscopy (CVC), an alternative image-enhancement technology to CE, uses multiple optical channels to record the spatial and multispectral information of the target simultaneously; one example is the flexible spectral imaging color enhancement system (FICE, Fujinon, Tokyo, Japan). However, some studies indicate that FICE does not improve detection rates of colonic polyps when compared with either conventional white-light endoscopy or CE9,10,11. Moreover, such technologies share the limitations of the other vision-based methods.

All of the above-mentioned methods provide diagnostic information to clinicians, who then make the clinical decisions. Inter-endoscopist variability has a considerable effect on the outcome12, which creates a need for objective screening rather than the current subjective observation by experienced clinicians. Moreover, the tactile sensory modality has not yet been applied in current clinical practice. On the one hand, this absence limits the ability of clinicians to identify small polyps and tissue structural and stiffness abnormalities caused by invasive adenocarcinoma or other diseases, which often cannot be identified via endoscopic cameras. On the other hand, although many optical methods are already widely used in surgery, none of them can provide the sense of touch, and thus efficient and reliable force feedback, for minimally invasive surgery or other fine operations13,14.

Recent research on the sensory mechanism of rat whiskers has shown its potential in various applications to complement vision-based methods and provide tactile sensory feedback15,16,17. Because of their poor eyesight, rats rely on their whiskers to obtain tactile information about their surroundings, which enables them to navigate in complete darkness and confined areas. This tactile sensing process is analogous to that of human fingertips, with which we interact with nearby objects to capture their textural and structural information. We rely on such tactile sensing to complement our vision, allowing us to sense the hardness, shape, and roughness of objects, which forms the feedback we use to control, manipulate, or interact with them. In fact, clinicians often rely on their tactile sensing ability to diagnose patients; for instance, palpation is commonly used in physical examination. As a novel tactile sensory modality, artificial whiskers have been shown to perform remarkably well in texture discrimination and distance perception tasks18,19, which sheds light on their potential application in clinical gastrointestinal disease screening, since changes in surface texture and topology are often highly correlated with specific pathological changes, such as polyps, bacterial infection, ulcerative colitis, and tumors.

Inspired by the sensory mechanism of the rat whisker, this paper proposes a biomimetic artificial whisker-based hardware system with artificial intelligence-enabled self-learning capability for gastrointestinal screening applications. Its texture recognition and distance perception capabilities, based on this novel tactile modality, could complement vision-based methods and enable the whisker to sense polyps, mucosal disease, and other abnormalities on the gastrointestinal wall, even in situations where an optical device cannot work well, for instance, in a completely dark environment or un-illuminated areas of the lumen. In traditional vision-based endoscopy, the image is captured by an embedded camera, often on the tip of the endoscope, and is then transmitted back to a computer for processing and display on a screen for clinical diagnosis. In the proposed method, the raw data acquired by the whisker-based hardware system can be fed directly into an end-to-end deep learning network without the need to extract handcrafted features. This allows for automated analysis and identification of potential abnormalities, such as disease-related stiffness changes of soft lumen tissues. By adopting this learning network, the burden of data interpretation on clinicians and the inter-endoscopist variability due to differences in diagnostic experience can be reduced, enabling standardized objective assessments. Compared to conventional machine learning methods20,21, this end-to-end data analysis method eliminates the time-consuming and laborious feature engineering process required in traditional signal processing methods, thereby reducing the complexity of data processing.

Benchmark tests were conducted to assess the electrical characteristics and core functions of the whisker-based system, which included texture discrimination, distance perception, and stiffness characterization. In addition, to explore the potential for clinical application, a pilot study used an ex-vivo sample to assess the ability of the proposed method to distinguish some common tissue structures, partly related to pathological changes: normal tissue, ulcerative colitis, and ulcerative cancer. After iterative training of the end-to-end model, the test accuracy reached 100% for some trials, and 5-fold cross-validation demonstrated an average accuracy of 94.44% across the datasets, showing the potential of the proposed method as a new sensing modality for disease screening. Furthermore, the algorithm optimization results show that even a deep learning network with a relatively small number of layers can still achieve high detection accuracy, which means that relatively low computational power is required; these results provide a basis for using edge computing to perform real-time diagnosis. In other words, the low computational requirement indicates that the proposed system can be implemented as a submodule of a surgical robot system without introducing considerable complexity in system integration, enabling tactile feedback for surgical instruments and supporting the development of autonomous robotic surgery. This emerging tactile perception-based screening method could be a promising complement to conventional vision-based endoscopy and could also form a low-cost alternative early screening mechanism for people living in low- and middle-income countries.

Results

Conceptual design

Figure 1 shows the conceptual design of the proposed method. In a gastrointestinal examination, the proposed whisker-based hardware system can be inserted into the intestinal tract via a natural orifice and then identify pathological tissue abnormalities caused by specific diseases along the lumen. Similar to rat whiskers, the proposed system will allow the endoscopic system to sense the surrounding tissues and to navigate through the elongated lumen of the gastrointestinal tract. When the whisker contacts the surrounding tissues in the lumen, it captures different response signals, which correspond to pathological information. Based on these signal responses, together with the proposed algorithms, the structure and contour of abnormal tissues could be extracted and diagnosed separately to assist clinical decisions.

Fig. 1: The conceptual design of the proposed artificial whisker-based method inspired by rat whiskers offers a promising approach for GI disease screening.

A, B Illustrate the working mechanism of rat whiskers when they travel through a lightless pipe-like environment. C A simulated intestinal tract with pathological changes. D, E Demonstrate the proof-of-concept design of the proposed artificial whisker system, and the clinical screening routine with the proposed system. Here, we focus on a single channel subsystem, which could also assist clinical diagnosis.

To achieve this, a high-performance whisker-based hardware system and analysis algorithms are required for precise sensing and high-level interpretation to support clinical decisions. The whisker-based system is designed to encode pathological information into electrical signals; these raw electrical signals are then converted into high-fidelity digitized signals through multi-stage amplification, filtering, and analog-to-digital conversion. The digitized signals can then be transmitted to a personal computer (PC) or workstation via a wireless or tethered communication link for further interpretation. By using fine-tuned analysis algorithms to translate the raw signals into clinical decisions, this novel method has great potential to provide an end-to-end screening capability, from perception and analysis to diagnosis, with no or minimal human intervention, which minimizes inter-endoscopist variability. Given that the design parameters of the proposed system have a direct impact on the final performance, a modeling analysis of the working mechanism of the rat whisker is crucial for optimizing the subsequent design and implementation.

Modeling and design parameter optimization

Much research in biology and neuroscience22 has shown that the sensing capability of whiskers mainly relies on the follicles in which the whiskers are rooted. As shown in Supplementary Fig. S1, the anatomical structure of a follicle includes several mechanoreceptors, which encode parameters such as curvature, taper, elasticity, texture, and distance into the generated electrical signals. Typically, a standard bionic design simulates the function of the follicles by combining a transduction solution with a fixed whisker made of one of various materials; however, the working mechanism of an artificial whisker-based system has a direct impact on the targeted tasks and leads to different responses and performance. In general, there are two types of working mechanism, the static model and the dynamic model, as shown in Fig. 2A–D, which refer to the behavior of the whisker during the sensing period.

Fig. 2: The modeling analysis of an artificial whisker-based system, including static and dynamic models, which provides insights into its working mechanism.

A The static model of the whisker system. Generally, the artificial whisker can be modeled as a cantilever beam with a free end, which satisfies the Bernoulli-Euler equation. Here, static means that the fixture is fixed during the sensing stage. B An analytical solution of the localization accuracy problem in the static model. As the distance between the whisker and the target increases, this design leads to indistinguishable changes in the flexion angle, on which the system relies for distance perception. C A motor-driven dynamic model of the whisker system, which provides more robust sensing capability at the cost of a large volume and difficulty of integration. D A variant dynamic model driven by external movement, featuring higher integration and robustness.

According to the theory of mechanics of materials23, the artificial whisker can be modeled as a cantilever beam24,25,26 with a free end under both static and dynamic conditions, which satisfies the Bernoulli-Euler equation. In the static model, as shown in Fig. 2A, the analytical solution in the materials and methods section demonstrates that the distance parameter d can be estimated by capturing the values of the deflection angle θ and the position x at several different observation points in a single experiment; such values can be measured by placing several sensors along the beam. However, one of the major limitations is the localization accuracy. The two sensors should be placed relatively close to each other along the neutral axis to improve system integration and reduce system volume. Such a design limits the ability of the sensors to measure the deflection angle changes on which the system relies. Assuming that the sensors are fabricated at x1 = 10 mm and x2 = 20 mm, which are reasonable values given the manufacturing technology, the simulation results in Fig. 2B show that, for any predefined θ1, the variation of θ2, namely Δθ2, tends to zero as the detection distance increases. Owing to the limited sensitivity, resolution, and inherent noise level of the sensor, these small deformations are hard to capture and could even lead to stochastic sensing outputs within a small range regardless of the variation of the distance d. From the perspective of the electronics, the system response caused by distance variation becomes indistinguishable from the response caused by system noise, which ultimately leads to system malfunction. In addition, a single robust detection must be conducted when the system is in a static and stable state, which limits the potential for quick screening in the proposed application scenario. In summary, the need for multiple sensors, the limited sensing accuracy, the demands on system noise control, and the relatively low detection speed limit the utility of a static sensing model.
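The loss of sensitivity in the static model can be illustrated with a minimal sketch. It assumes a small-deflection Bernoulli-Euler cantilever clamped at x = 0 and loaded by a single point contact force F at distance d, for which the slope at a sensor position x ≤ d is θ(x) = F·x·(2d − x)/(2EI). This is a textbook simplification chosen for illustration, not the analytical solution of the materials and methods section, and the function names and the value θ1 = 5° are hypothetical. Under this assumption the ratio θ2/θ1 is independent of F and EI and approaches x2/x1 as d grows, so for a fixed θ1 the change in θ2 per millimeter of distance shrinks toward zero.

```python
def slope_ratio(d_mm, x1_mm=10.0, x2_mm=20.0):
    """Return theta(x2)/theta(x1) for a point load at distance d (small deflections).

    Assumes theta(x) = F*x*(2d - x)/(2EI) for x <= d, so F and EI cancel.
    """
    if d_mm < x2_mm:
        raise ValueError("contact point must lie at or beyond the outer sensor")
    return (x2_mm * (2.0 * d_mm - x2_mm)) / (x1_mm * (2.0 * d_mm - x1_mm))


def delta_theta2_deg(theta1_deg, d_mm, step_mm=1.0):
    """Change in theta2 (degrees) when d grows by step_mm while theta1 is held fixed."""
    return theta1_deg * (slope_ratio(d_mm + step_mm) - slope_ratio(d_mm))


# Sensor positions follow the text (x1 = 10 mm, x2 = 20 mm);
# theta1 = 5 degrees is an illustrative value only.
for d in (20.0, 40.0, 80.0, 160.0):
    print(f"d = {d:5.0f} mm  theta2/theta1 = {slope_ratio(d):.4f}  "
          f"delta_theta2 per mm = {delta_theta2_deg(5.0, d):.5f} deg")
```

In this simplified picture the per-millimeter change in θ2 falls off roughly with the square of the distance, which is consistent with the trend reported for Fig. 2B: at larger distances the useful signal eventually drops below the sensor noise floor.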

Recent research has shown that active whisking or other external movement plays an important role in rats’ spatial perception capability27. Whisking and forward/backward movement extend the dynamic range of the sensing capability and provide a more robust sensory modality. A typical example of a dynamic model is shown in Fig. 2C. The common strategy in dynamic models is a motor-driven solution. In this case, the beam is fixed to a fixture base, which is connected to a motor that provides the rotation torque. According to the analytical solution in the materials and methods section, although the size of the fixture inevitably introduces a more complicated relationship among the parameters of interest, the model can provide adequate information to estimate the distance with only one sensor that measures torque. The angle α is not a predefined value; it depends on the termination point of the fixture where the whisker is inserted and on the target object to be detected. Hence, a series of pre-collected signals could be used to calibrate the sensor. Moreover, under the simplifying assumption that the beam connected to the actuator is at the center of rotation, a more succinct analytical solution can be obtained and the influence of the angle α can be eliminated, which means that the feasibility and effectiveness of detection improve dramatically as long as the measurement is within an allowable error range. Besides the improved robustness of the dynamic model, a single detection can be conducted over several trials by adjusting the rotation torque to generate different responses, which leads to more reliable detection.

The dynamic model yields more robust sensing than the static model; however, for the proposed system to be integrated into clinical diagnostic devices as an add-on submodule, a motor-driven solution is not appropriate given its size and biocompatibility issues. In our design, an externally driven movement approach is adopted, which is a variant of the actively rotated dynamic model. Our analytical calculation shows that this externally driven model has performance comparable to the rotated one, and that one observation point along the beam is sufficient to provide distance information; however, the calculation is more complicated, which could impose an extensive workload on the central processing unit (CPU) of the system and reduce the accuracy. To make this novel sensing modality simpler and more robust for clinical GI disease screening, we reduce the dependence on absolute positioning accuracy in experiments and instead emphasize the sequential information in a single continuous whisking movement, which corresponds to the idea of distance perception rather than accurate localization. Based on this idea, a deep learning algorithm is developed for the nonlinear mapping between the output signal and the parameters of interest, such as textural information, hardness, and distance, so as to translate the output into information that assists clinical diagnosis. To acquire this sequential scanning information, the proposed whisker-based system should have rapid response capability and a large dynamic range, which guarantee high-fidelity structural and textural detection.

System design and implementation

Based on the modeling analysis, a whisker-based system built on the dynamic model, with a low noise level, rapid response capability, and a large dynamic range, is needed. Accordingly, a system consisting of a whisker-based sensor and a signal conditioning circuit was designed and developed.

As mentioned above, a whisker-based sensor generally consists of a signal transduction solution that mimics the follicles and a whisker, both of which are prerequisites for high-level bionics. Many signal transduction solutions have been proposed to simulate the function of follicles, ranging from electret microphones28 and strain gauges18,29 to Hall-effect sensor arrays30,31,32. Owing to their different sensing principles, these solutions have their own advantages and limitations. For instance, the electret microphone solution is sensitive to deformation of the whisker during contact, but its volume is too large for integration into systems such as those for minimally invasive surgical applications. A similar problem exists with the Hall-effect sensor. The Hall-effect magnetic sensor array, which normally includes two pairs of orthogonally placed sensor chips, converts the deformation angles of the whisker (to which an external permanent magnet is attached) into magnetic flux measurements. In contrast, strain gauges overcome this problem to some extent. Based on the piezoresistive effect, strain gauges effectively transform mechanical deformation and vibration into small resistance variations. In this case, it is common practice to use a full-bridge Wheatstone configuration to achieve temperature compensation33 and an enhanced linear response. (More details on the application fields, transduction solutions, and characteristics of whisker systems in existing research can be found in Supplementary Table S1 of the Supplementary Material.) In our design, the signal transduction solution is based on a polyvinylidene fluoride (PVDF) film transducer. PVDF is a semi-crystalline polymer, and the PVDF transducer is a type of piezoelectric sensor. The piezoelectric effect is characterized by the generation of an electric charge proportional to the applied stress when the film is subjected to vibration or deformation, resulting from polarization of the molecules in the polymer. PVDF transducers possess remarkable features, such as low output capacitance, rapid response, wide frequency response range, high mechanical strength, and stable chemical properties, which have led to their wide adoption in both research and industrial applications. However, the signal generated by the PVDF film transducer is an ultra-weak charge variation with high output impedance. A pre-processing circuit, consisting mainly of a charge amplifier, is therefore designed for impedance matching and for converting the original charge signal into a voltage signal for signal conditioning. Figure 3A illustrates the equivalent circuit of the PVDF film transducer together with the charge amplifier schematic. The analytical solution demonstrates that the selection of Cf requires a compromise between signal amplification and signal accuracy. In this design, signal accuracy and a wide frequency response range take priority in order to extract detailed textural and structural information; thus, a relatively large Cf capacitor is selected. To compensate for the resulting limited amplification gain, a high-precision signal conditioning circuit with 24-bit analog-to-digital conversion resolution is used.
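The Cf trade-off can be made concrete with a short sketch based on the ideal charge-amplifier relations: the output magnitude is |Vout| = Q/Cf, and, if a feedback resistor Rf is placed in parallel with Cf (an assumption here, since the text does not describe the feedback network in detail), the lower cutoff frequency is 1/(2π·Rf·Cf). The component values below are hypothetical and are not taken from the paper.

```python
import math


def charge_gain_v_per_pc(cf_farad):
    """Ideal charge-amplifier gain magnitude |Vout/Q| = 1/Cf, in volts per picocoulomb."""
    return 1e-12 / cf_farad


def lower_cutoff_hz(rf_ohm, cf_farad):
    """High-pass corner set by the feedback network: 1 / (2*pi*Rf*Cf)."""
    return 1.0 / (2.0 * math.pi * rf_ohm * cf_farad)


RF_OHM = 1e9  # hypothetical feedback resistor, for illustration only

# Sweep hypothetical feedback capacitors: larger Cf lowers both gain and cutoff.
for cf in (100e-12, 1e-9, 10e-9):
    print(f"Cf = {cf * 1e9:6.2f} nF  gain = {charge_gain_v_per_pc(cf) * 1e3:7.3f} mV/pC  "
          f"lower cutoff = {lower_cutoff_hz(RF_OHM, cf):.4f} Hz")
```

Under these assumptions a larger Cf lowers the gain but also pushes the lower cutoff down, which widens the usable low-frequency range; this is the compromise the text resolves in favor of a larger Cf, with the lost gain recovered by the high-resolution ADC.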

Fig. 3: The design of the proposed hardware system for artificial whisker-based sensing, which comprises a whisker-based sensor, signal conditioning circuits, and the embedded control algorithm.

A A pre-processing circuit (labeled as 3 in B), a charge amplifier, is designed to achieve impedance matching as well as charge-to-voltage signal conversion. B The structure of the whisker-based sensor. In this design, a guitar string, consisting of a thin core with wire wrapped around it, is used as the whisker shaft. A PVDF film is used as the transducer, exploiting its piezoelectric properties and large dynamic range. C The structure of the signal conditioning circuit, which contains a 3D-printed upper shell (labeled as 1), the PCB of the signal conditioning circuit (labeled as 2), a Li-ion battery (labeled as 3), and a 3D-printed lower shell (labeled as 4). D Flowchart of the embedded control program for command transmission and data exchange. E The framework of the signal conditioning circuit, which contains an AFE system, an MCU system, and a power supply system. The SPI protocol is used for timing control and data/instruction exchange between the MCU and the AFE system.

Apart from the PVDF film transducer and the pre-processing circuit, a whisker is needed to construct a biomimetic whisker-based sensor. Whisker designs vary considerably, ranging from stainless steel wires18 and specially molded composites34 to real rat whiskers35. In this work, a guitar string, which typically consists of a thin core with wire wrapped around it, is adopted as the whisker. The main advantages of this material are its high dynamic response and flexibility. Figure 3B and Supplementary Fig. S3B show the prototype of the entire whisker-based sensor. Assembly of the prototype involves soldering and encapsulation steps. The pre-processing printed circuit board (denoted as 3) handles forward and backward signal transmission and is soldered, through signal sockets at its lower edge, onto the post-stage signal conditioning circuit board introduced below. On the upper side of module 3, a PVDF film sensor (denoted as 2) from TE Connectivity® is soldered through its electrode pins, which provide part of the adhesion strength. Furthermore, a whisker is glued and fixed on the surface of the PVDF film in such a way that the whisker can deform freely.

The signal generated by the whisker-based sensor is an ultra-weak analog voltage signal, which requires further processing before capture and analysis. As shown in Fig. 3E and Supplementary Fig. S3C, a post-stage signal conditioning circuit is developed for this purpose. A signal conditioning circuit can generally be built with discrete components; however, by using an analog front end (AFE) and a microcontroller unit (MCU), together with compact peripheral passive and active components, the signal-to-noise ratio can be enhanced remarkably36. Specifically, the analog response signal is first fed into an anti-aliasing circuit to remove out-of-band noise. Subsequently, the output signal is routed into the AFE subsystem for signal re-routing, multi-stage amplification, analog-to-digital conversion, and low-pass filtering. In this way, the original deformation signal is converted into a digitized waveform for context interpretation. A 24-bit high-precision ADC is used to offset the limited amplification gain and to provide sufficient resolution for sensing fine textural and structural details. All of the signal processing mechanisms mentioned above are controlled by the MCU. Figure 3D shows the workflow of the embedded control algorithm. Through the serial peripheral interface (SPI), the MCU provides timing control, data transmission, instruction control, and related functions. Moreover, by combining programmable logic components, parameters of the signal conditioning circuit, such as gain, sampling rate, and channel routing, can be reconfigured. In addition, 3D printing is used for the packaging of the whisker-based system. The structure of the proposed signal conditioning circuit is shown in Fig. 3C and Supplementary Fig. S3A. The prototype adopts a stacked architecture. From top to bottom, there are a 3D-printed top cover (denoted as 1), the PCB of the signal conditioning circuit (denoted as 2), a Li-ion battery (denoted as 3), and a 3D-printed base cover (denoted as 4). The Li-ion battery module is soldered onto the socket of the signal conditioning circuit, and the same connection method is used to integrate the whisker-based sensor with the signal conditioning circuit. Furthermore, a waterproof glue is applied to bond and seal the prototype, which ensures that the prototype can work in high-humidity environments and even when submerged in water.
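As an illustration of the last step of this chain, the sketch below converts one 24-bit sample, as it might arrive over SPI, into a voltage. The AFE part and its data format are not named in the text, so the sketch assumes a 3-byte, big-endian, two's-complement output code and a bipolar full-scale range of ±Vref/gain; the reference voltage, gain, and function name are hypothetical.

```python
def adc_code_to_volts(raw_bytes, vref=4.5, gain=1.0):
    """Convert a 3-byte big-endian two's-complement ADC sample to volts.

    Assumes a bipolar input range of +/- vref/gain, so one LSB = vref / (gain * 2**23).
    vref and gain are hypothetical defaults, not values from the paper.
    """
    if len(raw_bytes) != 3:
        raise ValueError("expected exactly 3 bytes for a 24-bit sample")
    code = int.from_bytes(raw_bytes, byteorder="big", signed=True)
    lsb_volts = vref / (gain * 2 ** 23)
    return code * lsb_volts


# Example codes: zero, +1 LSB, -1 LSB, and negative full scale.
for sample in (b"\x00\x00\x00", b"\x00\x00\x01", b"\xff\xff\xff", b"\x80\x00\x00"):
    print(sample.hex(), f"{adc_code_to_volts(sample) * 1e6:.3f} uV")
```

With the hypothetical 4.5 V reference and unity gain, one LSB is about 0.54 µV; this shows how a 24-bit converter can resolve microvolt-level signals even when the preceding analog gain is modest.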

Benchmark experiment 1: the inherent noise evaluation

The electrical performance of the proposed whisker-based system is the basis for acquiring high-fidelity signals, which are the main focus of the following experiments, such as texture discrimination. First, an inherent noise test, a crucial electrical performance index, is conducted to assess thermal effects on the system. As the modeling analysis shows, a low noise level is critical for robust and precise sensing. The inherent noise level also limits the signal-to-noise ratio and bandwidth of the proposed system, both of which directly affect its dynamic performance. Typically, the noise is caused by internal voltage noise and current noise and is highly related to temperature fluctuation. In this test, the input channels are shorted to the common-mode voltage, and the temperature is varied from 0 °C to 100 °C in steps of 10 °C. For each temperature test point, 5 trials were conducted, each containing 50,000 consecutive conversion points at a sampling rate of 250 Hz. The root mean square (RMS) of the output noise signal, given below, is used to assess the average noise level, where vi and N represent a single conversion point of the digitized output signal and the number of conversion points, respectively. In addition, the peak-to-peak noise (Vpp) is used as another index to assess the noise distribution.

$$v_{\mathrm{rms\_of\_noise}}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}v_{i}^{2}}$$
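The two noise indices can be computed directly from a recorded trial, as in the following minimal sketch. The RMS follows the definition above (no mean is subtracted); the peak-to-peak value is taken as the difference between the largest and smallest conversion points, which is the usual definition but is an assumption here because the text does not state it explicitly. The function names and sample values are illustrative.

```python
import math


def rms_noise(samples):
    """Root mean square of the conversion points: sqrt((1/N) * sum(v_i**2))."""
    if not samples:
        raise ValueError("at least one conversion point is required")
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def peak_to_peak_noise(samples):
    """Peak-to-peak noise: largest minus smallest conversion point."""
    if not samples:
        raise ValueError("at least one conversion point is required")
    return max(samples) - min(samples)


# Illustrative values in microvolts (not measured data).
trial_uv = [0.3, -0.4, 0.1, -0.2, 0.5, -0.5]
print(f"RMS = {rms_noise(trial_uv):.3f} uV, Vpp = {peak_to_peak_noise(trial_uv):.3f} uV")
```

For the test described above, each trial would supply N = 50,000 conversion points to these functions.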

Experimental results, as shown in Fig. 4A, reveal that the inherent noise is relatively stable in the working temperature range (10–50 °C), with a maximum of 0.75 µVrms and 6.16 µVpp. The noise nevertheless increases with temperature, as expected, since electronic thermal noise increases as the temperature rises37. More details of the noise waveforms can be found in Supplementary Fig. S4 in the Supplementary Material. The test results indicate that the magnitude of the inherent noise is negligible and has minimal effect on the signal quality. Meanwhile, power dissipation, which normally takes the form of heat generation, should also be taken into consideration, as high temperatures could cause tissue burns and scalds. A commercial infrared imaging device (Fluke® TiX580 Infrared Camera) is used to evaluate the thermal distribution of the system. Figure 4B shows the test results, with the background temperature used as a reference. The results demonstrate that the whisker temperature rise caused by heat dissipation during long-term operation is less than 3.1 °C over 9 h of continuous use, which suggests that the proposed system can be used safely in surgical applications.

Fig. 4: The proposed hardware system exhibits enhanced precision and robustness attributed to its low inherent noise characteristics and low thermal dissipation.

A The boxplot results of the inherent noise evaluation with temperature changes. B The peak-to-peak noise (Vpp) and RMS noise distribution of noise evaluation task. C Visualization of thermal dissipation issue over 0, 3, 6, and 9 h of operation for the proposed whisker system, captured using a commercial infrared imaging device (Fluke® TiX580 Infrared Camera). The numbers at the center of the infrared figures indicate temperature readings in Fahrenheit at the respective measurement points.

Benchmark experiment 2: core function evaluation

In the previous section, the performance of the whisker-based system was evaluated in terms of its electrical characteristics. To evaluate the potential of the proposed system for surgical applications, a comprehensive test protocol is designed to assess task-related performance, including the sensing of texture, distance, stiffness, and shape features. In nature, the perception and navigation capability of the rat whisker relies on the animal’s central nervous system to decode and translate the raw signal into a form it can interpret, even in the simultaneous presence of other sensory modalities such as vision. Although it is not clear how these processes are carried out, different signal features can still be inspected and extracted from the signal collected by the proposed system according to the detection principles of the PVDF-based sensing solution. Supplementary Table S2 summarizes the detection principles for the different sensory dimensions. By comparing the signals in the time domain, frequency domain, and time-frequency domain, texture discrimination, distance perception, stiffness characterization, and shape recognition can be validated.

Since the proposed whisker-based system works as a variant of the rotational dynamic model, an external actuator is required to drive the whisker and enable it to sweep across an object. To achieve accurate control, a 3D printer is utilized as the experimental platform, as shown in Supplementary Fig. S5, to provide this actuation. A 3D printer typically consists of subsystems such as the frame, print bed, extruder, stepper motors, and control board, among others. The print bed provides precise movement along the Z-axis, while the stepper motors provide precise movement in the XY-plane. During experiments, we attached the whisker-based system to the stepper motor module and placed the object to be tested on the print bed. Through the software interface, the proposed system can be driven with high precision. In addition, given that the whisker is analogous to the tactile sensing of human fingertips, the sweeping speed directly affects the response signal. Empirically, a higher sampling rate with a slow movement speed results in a precise response with more detail; however, the control parameter settings should be adjusted according to the task to highlight particular waveform characteristics. The detailed settings are described in the following sections.

Texture discrimination

Texture characterizes the consistency of the microstructural elements of a surface. Textural features can be distinguished by the signal waveforms, frequency components, and time-frequency distribution of the signal captured by the proposed system. Three types of materials were chosen for the experiment: an adhesive tape with a smooth surface, sandpaper, and a soft fabric cloth, all of which are shown in Supplementary Fig. S6. During the test, the height between the object surface and the whisker base was set to about 56 mm. In the first batch of trials, the movement speed was set to 3000 mm/min, and the whisker took less than 1 s to sweep over the surface of the object, so there was no obvious difference in the output signal among the materials. This is reasonable under the assumption that whisker sensing is similar to the sensing of a human fingertip, which means that a short interaction is not enough to acquire sufficient information. A series of comparative experiments indicated that a relatively low movement speed acquires richer textural information; the test results at a movement speed of 300 mm/min are shown in Fig. 5.

Fig. 5: Texture discrimination capabilities are demonstrated by notable differences in the sweeping stage across the time domain, frequency domain, and time-frequency domain perspectives.
figure 5

A The responses of smooth surface. From the top to bottom are the results of the signal waveforms, frequency components, and time-frequency distributions respectively when the proposed system sweeps across the surface of the target material. The arrangement of the rest (B, C) is the same. B The responses of sandpaper. C The responses of soft fabric cloth.

Specifically, the initial output baseline signal of the proposed system is 2450 mV, which corresponds to the static condition. When the whisker approaches the target object, several small vibrations are induced by external movements, which superimpose some noise onto the baseline. Subsequently, the signal rises, starting at the first contact between the whisker and the object. When the deformation reaches its maximum value, which corresponds to the state where the whisker tip is completely on the surface of the object, the signal amplitude starts to decrease in the sweeping stage. The last stage occurs when the whisker leaves the object surface: a very significant oscillation signal is generated due to the stored elastic potential energy, and this release point signal indicates the end of the interaction between the proposed system and the target object. From Fig. 5, a prominent difference in the time domain waveforms among the three types of output signals can be found in the sweeping stages, which correspond to the declining curves. The smoother the surface, the smoother the output signal. Furthermore, since the fabric has distinct warp and weft lines, the whisker occasionally changed its forward track during the sweeping stage, which corresponds to the ‘w’ shape in the output signal shown in Fig. 5C. In addition, the corresponding frequency domain and time-frequency distributions support the same conclusion. It is worth mentioning that even though the difference between the materials can be verified, there is high consistency among the categories, as shown in Supplementary Fig. S6, which presents the responses to backward movement in the texture discrimination tasks.

Distance perception

In the distance perception task, the control variable is the distance between the target object and the base of the whisker. The movement speed was set to 1500 mm/min, since the textural information is not the focus of this experiment, and the test height was varied from 48 to 57 mm in steps of 1 mm. For each distance setting, 3 trials were conducted. Figure 6 shows a test sequence with different test heights. The test results show that the release point oscillation signals have a highly linear relationship with the height parameter. Based on the release point oscillation signals extracted by a peak detection algorithm, first-order and cubic polynomials fit these features smoothly. Statistical indexes commonly used to assess fitting performance, namely the sum of squares due to error (SSE), coefficient of determination (R-square), root mean square error (RMSE), and adjusted coefficient of determination (adjusted R-square), are shown in Table 1. The results demonstrate that the polynomials fit the oscillation signal very well. In addition, for the velocity signals, defined as the derivative of the original signals, this linear relationship is also captured by the polynomial functions. The slight difference in fit between the original signals and the velocity signals is negligible according to the statistical indexes.
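As an illustration of how the fit indexes in Table 1 can be computed, the following is a minimal sketch using only the Python standard library. The amplitude values are made up for demonstration (they are not the measured data), the function names are hypothetical, and the RMSE and adjusted R-square are assumed to use the residual degrees of freedom (number of points minus number of fitted coefficients), which is a common convention but is not stated in the text.

```python
import math

def polyfit_centered(x, y, degree):
    """Least-squares polynomial fit in powers of (x - mean(x)).

    Centering x keeps the normal equations well conditioned.
    Returns (coefficients from order 0 upward, x_mean).
    """
    x_mean = sum(x) / len(x)
    xc = [xi - x_mean for xi in x]
    m = degree + 1
    # Normal equations (V^T V) c = V^T y for the Vandermonde matrix V.
    A = [[sum(xi ** (r + c) for xi in xc) for c in range(m)] for r in range(m)]
    b = [sum(yi * xi ** r for xi, yi in zip(xc, y)) for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for r in reversed(range(m)):
        s = sum(A[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (b[r] - s) / A[r][r]
    return coeffs, x_mean

def fit_statistics(x, y, degree):
    """Return SSE, R-square, RMSE, and adjusted R-square of a polynomial fit."""
    coeffs, x_mean = polyfit_centered(x, y, degree)
    pred = [sum(c * (xi - x_mean) ** k for k, c in enumerate(coeffs)) for xi in x]
    n, m = len(x), degree + 1
    y_mean = sum(y) / n
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    sst = sum((yi - y_mean) ** 2 for yi in y)
    return {
        "SSE": sse,
        "R2": 1.0 - sse / sst,
        "RMSE": math.sqrt(sse / (n - m)),  # assumes residual degrees of freedom
        "AdjR2": 1.0 - (sse / (n - m)) / (sst / (n - 1)),
    }

# Hypothetical data: heights 48..57 mm and made-up release-point amplitudes (mV).
heights = [float(h) for h in range(48, 58)]
amplitudes = [812.0, 771.5, 726.0, 689.0, 640.5, 603.0, 561.5, 518.0, 480.5, 436.0]

linear = fit_statistics(heights, amplitudes, 1)
cubic = fit_statistics(heights, amplitudes, 3)
print("first-order:", linear)
print("cubic:      ", cubic)
```

Because the cubic model contains the first-order model as a special case, its SSE can never be larger; the adjusted R-square is the index that penalizes the extra coefficients.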

Fig. 6: The strong correlation between the detection distance and response signals validates the distance perception capability of the whisker-based hardware system.
figure 6

A The experimental setup during the task. The proposed system is mounted on the 3D printer, and the distance is controlled precisely by a dynamic heated bed. B The response signal generated by the proposed system with different controlled distances. The upper waveform shows the raw signal while the bottom one is the velocity derived from the raw signal. C The polynomial fitting results for distance estimation based on release point signals. D The polynomial fitting results for distance estimation based on velocity signals.

Table 1 The statistical fit indexes (SSE, R-Square, RMSE, Adjusted R-Square) for assessing the fitting of distance perception task using peak point and velocity signals.

Stiffness characterization

In general, different tissues have different stiffness, and tissue stiffness is another biomarker commonly used for health assessment, since some diseases, such as cancer, can affect tissue elasticity and stiffness. Given the tactile sensing capability of the proposed system, its stiffness characterization ability was evaluated. During the test, ex-vivo samples (i.e., soft tissue and bone), shown in Supplementary Fig. S7, were used. By adjusting the height of the print bed to keep the distance between the object and the whisker base equal, different responses were obtained using the proposed system, as shown in Fig. 7A, B. Similar to the distance perception task, the value of the release point is significantly different for soft tissue and bone, while the first fundamental frequency point is almost the same. This fundamental frequency depends strongly on the material or inherent mechanical characteristics of the whisker, which will be discussed in the Discussion section.

Fig. 7: The data visualization demonstrates the stiffness characterization and shape recognition capabilities across the time domain, frequency domain, and time-frequency domain perspectives.
figure 7

For each column, the results are arranged from top to bottom in the order of signal waveforms, frequency components, and time-frequency distributions. A The response of the soft tissue during the stiffness characterization task. B The response of the bone during the stiffness characterization task. C–E The responses of round, flat, and bevel shapes, respectively, during the shape recognition task. The objects with different shapes used for the tests were designed and fabricated by 3D printing.

Shape recognition

In the texture discrimination task, the sensing focused on the sweeping stage, and for the other two tasks described above, the signal at the point where the whisker leaves the object was the main focus. In contrast, in recognizing shapes, the entire interaction (i.e., from the first contact with the object, through the sweep across its surface, to the point when the whisker leaves the object) must be measured; thus, the shape recognition task is a more comprehensive assessment. During the test, three objects with round, flat, and bevel surfaces were designed and fabricated using a 3D printer, as shown in Supplementary Fig. S8. The time domain waveforms of the whisker sensing signals can be used to classify these objects, as shown in Fig. 7C–E. Specifically, the signal generated from a round surface is relatively symmetric: it gradually reaches the maximum amplitude and decreases for a short period, followed by a sharp drop. The flat surface has a long descent, since the whisker takes longer to sweep across the surface of the model. The signal collected from a bevel surface shows a rapid shift from the rising trend to a sharp drop, since the leaving point comes right after the whisker is maximally deformed.

Pilot study towards clinical application

To assess the clinical application potential of the proposed method, a pilot study (i.e., abnormal tissue screening) was conducted after the benchmark performance tests of the proposed whisker-based system. A phantom (Model: FC-GHGH, Guangzhou LanDie Teaching Model Co., Ltd.) was used for the abnormal tissue screening task, as shown in Fig. 8A and Supplementary Fig. S9. Three typical biological tissues, namely normal tissue, ulcerative colitis, and ulcerative rectal cancer, were the screening targets. Although the phantom is far from a real intestinal tract, the pathological anomalies have their own unique and consistent topologies and surface structures. Hence, this proof-of-concept experiment is indicative of the feasibility of this novel method. During the experiments, the phantom was fixed on the workbench, and the whisker-based system was swept across the surface of the phantom while hand-held. This aims to simulate the real screening procedure, where the system is assumed to be operated by clinicians or robotic systems rather than by an ultra-stable actuator that moves along a pre-defined trajectory. A total of 120 trials were conducted for each type of biological tissue, and each trial contained a complete screening sequence from the system’s first contact with the object surface to the moment it leaves the surface.

Fig. 8: Outcomes of the proposed method in the pilot study for assisting clinical application.
figure 8

A–C The workflow of the pilot study. A Data collection and pre-processing stage. The raw signal collected by the whisker system is filtered, zero-centered, and normalized. B Deep learning algorithm design, which is used for translating the raw signal into diagnostic outcomes. The proposed end-to-end algorithm consists of 4 stacked CNN layers, 2 LSTM layers, and a linear layer. C Clinical decision-making stage, which translates the probability distribution output by B into a clinical decision. D The distribution of the original response signals obtained from the proposed system, visualized using the t-SNE algorithm. E–H Confusion matrices of different network structures, which correspond to the 421, 401, 201, and 001 models listed in Table 2.

The algorithm is crucial for translating the raw signal into relevant information for clinical diagnosis. First, the raw signal collected by the whisker-based system is zero-centered and normalized to the range of (−1, +1) to scale the features into the same range, which helps the deep-learning training to converge and the network to be optimized. Subsequently, a deep neural network consisting of 4 stacked convolutional neural network (CNN)38 layers and 2 long short-term memory (LSTM) layers was designed for the classification task, as shown in Fig. 8B. One-dimensional CNN layers were used for feature extraction, while batch normalization, max pooling, and dropout were adopted to reduce the feature dimension and improve the generalizability of the network. Compared with traditional recurrent neural networks (RNNs)39, LSTM overcomes the short-term memory effect of RNNs and is capable of learning signal patterns from long time series. This capability is achieved by using the gate control mechanism40.
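The zero-centering and normalization step can be sketched as follows. This is a minimal illustration only: the filtering stage is omitted, the sample values are invented around the 2450 mV static baseline, and scaling by the largest absolute value is one assumed way of reaching the (−1, +1) range; the text does not specify which scaling rule was used.

```python
def preprocess(signal):
    """Zero-center a 1-D signal, then scale it into the range [-1, +1]."""
    mean = sum(signal) / len(signal)
    centered = [s - mean for s in signal]
    peak = max(abs(c) for c in centered)
    if peak == 0.0:
        return centered  # flat signal: nothing to scale
    # Dividing by the largest absolute value keeps the mean at zero.
    return [c / peak for c in centered]

# Hypothetical raw samples (mV) around the 2450 mV static baseline.
raw = [2450.0, 2452.0, 2461.0, 2480.0, 2475.0, 2440.0, 2431.0, 2449.0]
normalized = preprocess(raw)
print(normalized)
```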

The 5-fold cross-validation method is used to assess the algorithm’s performance, which means that 96 samples from a total of 120 samples are used as the training set, while the rest are used as the test set in each fold. The t-SNE41 visualization shown in Fig. 8D indicates that the original signal distribution is chaotic. However, after 65 epochs of training, the test accuracy can reach 100% for some trials, and the averaged accuracy over the entire database reaches 94.44% with a kappa of 0.9167. Moreover, for the proposed system to serve as an add-on submodule for a clinical device, computational complexity is another important issue. Deep learning algorithms often require large amounts of memory and consume a lot of computing resources, so a compromise between accuracy and computational complexity should be considered. In our experiments, an ablation study is conducted to assess this issue. By removing layers from the current algorithm, different classification accuracies and other performance indexes are obtained, as shown in Table 2. The results demonstrate that even when the network is simplified to a single linear layer, it can still reach an averaged accuracy of 85.83% with only around 1000 parameters involved in the calculation. This ablation study indicates that this novel tactile modality-based method could be a promising method for clinical disease screening without much dependency on specific algorithms. In the future, a large-scale database can be built through extensive clinical experiments to train the deep learning algorithms, and a fast-screening procedure based on this new sensing modality could be developed and even integrated into surgical robots or endoscope systems with enhanced disease detection ability, higher accuracy, and improved robustness.
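The relationship between the reported accuracy and kappa can be checked with a short calculation. The sketch below computes the accuracy and Cohen's kappa from a confusion matrix; the matrix itself is hypothetical (three balanced classes with 340 of 360 trials classified correctly) and is not taken from Fig. 8. For balanced classes the chance agreement is 1/3, so an accuracy of 94.44% corresponds to a kappa of (0.9444 − 1/3)/(1 − 1/3) ≈ 0.9167.

```python
def accuracy_and_kappa(cm):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    n = sum(sum(row) for row in cm)
    k = len(cm)
    p_o = sum(cm[i][i] for i in range(k)) / n           # observed agreement
    row = [sum(cm[i]) / n for i in range(k)]            # true-class marginals
    col = [sum(cm[i][j] for i in range(k)) / n for j in range(k)]  # predicted
    p_e = sum(r * c for r, c in zip(row, col))          # chance agreement
    return p_o, (p_o - p_e) / (1.0 - p_e)

# Hypothetical pooled confusion matrix: 3 classes x 120 trials, 340 correct.
cm = [[115, 3, 2],
      [4, 112, 4],
      [2, 5, 113]]
acc, kappa = accuracy_and_kappa(cm)
print(f"accuracy={acc:.4f}, kappa={kappa:.4f}")  # accuracy=0.9444, kappa=0.9167
```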

Table 2 Performance differences with different network structures.

Discussion

A novel artificial whisker-based method is proposed for GI disease screening. The design concept of the proposed method is relatively straightforward, but extensive studies and experiments were conducted to refine the design and develop the early prototype. The results of the experiments have shown the feasibility and robustness of the proposed method. However, some potential issues, ranging from the design to the experiments, are discussed as follows.

Whisker characteristic

Different lengths, materials, diameters, and geometric structures will affect the response of the whisker-based system. During the preliminary experiments, whiskers made of the same material but with different diameters yielded different results at the first fundamental frequency point, as shown in Supplementary Fig. S10. Further investigation, for example of the response consistency among different whiskers, should be performed in the future. Meanwhile, the impact of the whisker characteristics (i.e., material, length, diameter, etc.) on the targeted classification tasks, i.e., disease screening, should be assessed to find the optimal design. In addition, the whisker material should be biocompatible, robust, and non-degradable by intestinal fluid and electrolytes in the intestinal tract.

The whisker geometry distribution

In this proof-of-concept design, a single whisker-based system was fabricated and assessed for potential applications. To provide a detailed screening of the gastric tract, the system will need to be extended into a multi-whisker system. However, the optimal geometric distribution of the whiskers is yet to be determined. A radial distribution along the arc shell of a medical device such as an endoscope, as shown in Fig. 1, or a distribution on part of the device, as shown in Supplementary Fig. S11, are viable options.

Motion speed

A whisker sensor is a type of tactile sensor, and its sensory mechanism is similar to fingertips. This means that different sweeping speeds will result in different responses. In other words, the faster the sensor moves, the less detail it can capture, which is similar to a conventional scanner. This phenomenon is caused by the limitation of the sampling rate of the nerve endings on the fingertip, as well as the processing speed of the nerve center. However, by adjusting the sampling rate of the whisker-based system, it can achieve a faster screening with reasonable detection accuracy. Moreover, the sampling rate should be refined and balanced between the sweeping speed and power consumption with respect to the targeted task. For instance, if the system is aimed for distance perception tasks, the sensing should be focused on the maximum deformation signal, rather than the signal details.
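As a rough illustration of this trade-off, the distance travelled between two consecutive samples is the sweep speed divided by the sampling rate. The sketch below evaluates it for the sweep speeds used in the benchmark experiments (300, 1500, and 3000 mm/min) at the two ends of the adjustable sampling-rate range (250 Hz and 2 kHz). The function name is illustrative, and no assumption is made about which sampling rate was actually used in each experiment.

```python
def sample_spacing_mm(speed_mm_per_min, sampling_rate_hz):
    """Distance (mm) travelled by the whisker base between consecutive samples."""
    speed_mm_per_s = speed_mm_per_min / 60.0
    return speed_mm_per_s / sampling_rate_hz

for speed in (300.0, 1500.0, 3000.0):
    for rate in (250.0, 2000.0):
        spacing = sample_spacing_mm(speed, rate)
        print(f"{speed:6.0f} mm/min at {rate:6.0f} Hz -> {spacing:.4f} mm per sample")
```

At a fixed sampling rate, a tenfold increase in speed coarsens the spatial sampling tenfold; raising the sampling rate by the same factor restores it, at the cost of more data to transmit and process.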

Abnormal tissue screening

During the abnormal tissue screening task, a phantom was used in the experiment. As mentioned, the phantom is still quite different from a real intestinal tract, but since pathological tissue abnormalities differ markedly in topology and surface structure, the proposed system should be able to distinguish and identify the abnormal tissues, as shown in the evaluation results. In the future, collecting large datasets to train deep learning models would enable rapid and precise screening, allowing the proposed system to be translated for clinical use.

Method optimization

To introduce this novel method into clinical practice, the proposed system must be optimized for the targeted applications. First of all, the whisker-based system should be miniaturized and designed to be compatible with existing surgical equipment, such as endoscopes and surgical robots, so that it could be integrated as a subsystem of current medical devices to assist diagnosis and treatment. Meanwhile, it is essential to integrate multiple whisker sensor units in one system to enable accurate screening, which could potentially map the intestinal tract into a 3D structure for better visualization and analysis. In addition, for use in surgery the system must be safe, waterproof, biocompatible, and compliant with EMC (electromagnetic compatibility) and ESD (electrostatic discharge) requirements.

Conclusion

In this paper, a biomimetic whisker-based method is proposed for gastrointestinal disease screening. By combining hardware design and end-to-end deep learning algorithms, this tactile perception-based method shows great potential in extracting the structural and textural information of the GI tract, which could potentially enhance or complement the current vision-based endoluminal diagnosis. The highly reconfigurable and integrated hardware design, coupled with the low computational power requirements of the algorithm, implies that this method could be incorporated as a standalone add-on subsystem with current endoscopic robotic surgery platforms, and potentially form a new screening routine based on enhanced algorithms trained with a large-scale database. Further investigation of the physical and mechanical properties of the whisker and the design of multichannel hardware systems could facilitate the optimization of the method. In addition, a strain gauge-based transduction solution is a promising candidate to complement the current PVDF-based solution, particularly with respect to providing static sensing capability.

Methods

Analysis of the static model

In the equivalent static model shown in Fig. 2A, a concentrated force of amplitude P is applied to the beam in the downward direction at position d along the x axis.

According to the elastic beam theory23, the beam equations listed below can be derived:

$${EI}{y}^{\prime\prime\prime}=-{Pu}\left(x-d\right)+{C}_{1}$$

where y is the deflection of the neutral axis of the beam (i.e., the whisker in the artificial whisker-based hardware system), and x is an observation point, which corresponds to the deformation point to be measured. d is the distance from the fixed point of the beam to the point where the force is applied, and C1 is a coefficient to be determined. E is the Young’s modulus of elasticity of the beam material, which characterizes the stiffness of the material. I is the cross-sectional moment of inertia, defined as below, where r is the perpendicular distance from the element to the rotation axis and V is the integration range, which is the entire rigid body.

$$I={\iiint_{V}}{r}^{2}{dm}$$

In other words, EI is a constant that characterizes the beam inherent features. In addition, the function u is a step function, as shown below:

$$u(t)=\left\{\begin{array}{ll}1, & t > 0\\ 0, & t\le 0\end{array}\right.$$

After multiple integration,

$${EI}{y}^{\prime\prime}=-P\left(x-d\right)u\left(x-d\right)+{C}_{1}x+{C}_{2}$$
$${EI}{y}^{{\prime} }=\frac{-P{\left(x-d\right)}^{2}u\left(x-d\right)}{2}+\frac{{C}_{1}{x}^{2}}{2}+{C}_{2}x+{C}_{3}$$
$${EIy}=\frac{-P{\left(x-d\right)}^{3}u\left(x-d\right)}{6}+\frac{{C}_{1}{x}^{3}}{6}+\frac{{C}_{2}{x}^{2}}{2}+{C}_{3}x+{C}_{4}$$

Utilizing the boundary conditions listed below:

$$\left\{\begin{array}{ll}{y{\rm{|}}}_{x=0}=0\\ {{EI}{y}^{{\prime} }{\rm{|}}}_{x=0}=0\\ {{EI}{y}^{\prime\prime}{\rm{|}}}_{x=L}=0\\ {{EI}{y}^{\prime\prime\prime}{\rm{|}}}_{x=L}=0\end{array}\right.$$

The coefficients can be obtained as: C1 = P, C2 = −dP, C3 = C4 = 0. Therefore, the formula above can be rewritten as:

$${EI}{y}^{{\prime} }={EI}\frac{{dy}}{{dx}}={EI}\,\tan \theta =\frac{P}{2}{x}^{2}-{Pdx},x\le d$$

where the θ is the deflection angle or bending angle at the point x along the neutral axis.

As shown in the formulas, the distance parameter d can be estimated by retrieving the observation values of θ and x from several different observation points in an experimental trial, which can be achieved by placing several sensors along the beam.

Assume that a pair of sensor data is (x1,θ1) and (x2,θ2), then, the estimated force P is:

$$P=2{EI}\frac{{x}_{1}\tan {\theta }_{2}-{x}_{2}\tan {\theta }_{1}}{{x}_{1}{x}_{2}^{2}-{x}_{1}^{2}{x}_{2}}$$

And therefore, the analytical solution of distance d is:

$$d=\frac{\frac{\tan {\theta }_{2}}{\tan {\theta }_{1}}{x}_{1}^{2}-{x}_{2}^{2}}{2(\frac{\tan {\theta }_{2}}{\tan {\theta }_{1}}{x}_{1}-{x}_{2})}$$

As the detection distance increases, the ratio of the deflection angles at two separate sensing points approaches a constant value, which can be seen from:

$$\mathop{\mathrm{lim}}\limits_{d\to {\rm{\infty }}}\frac{\tan {\theta }_{2}}{\tan {\theta }_{1}}=\frac{0.5{x}_{2}^{2}-d{x}_{2}}{0.5{x}_{1}^{2}-d{x}_{1}}\approx \frac{{x}_{2}}{{x}_{1}}=\xi$$

Combined with the formulation of d, this indicates that the measured ratio, and hence the estimate of d, becomes nearly insensitive to where the object is located. From the perspective of the electronics, the system response caused by distance variation then becomes indistinguishable from, and fused with, the response caused by system noise, which ultimately leads to system malfunction.
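The two-sensor estimate and its loss of sensitivity at large distances can be checked numerically. The sketch below generates the slopes at two observation points from the static relation EI tan θ = (P/2)x² − Pdx, then recovers d from the slope ratio and P by eliminating d between the two slope equations. All numerical values (EI, P, sensor positions, and units) are arbitrary assumptions chosen for illustration, and the function names are hypothetical.

```python
def tan_theta(x, d, P, EI):
    """Slope from EI*tan(theta) = (P/2)*x^2 - P*d*x, valid for x <= d."""
    return (0.5 * P * x ** 2 - P * d * x) / EI

def estimate_distance(x1, t1, x2, t2):
    """Recover d from the slopes t1, t2 measured at positions x1, x2."""
    rho = t2 / t1
    return (rho * x1 ** 2 - x2 ** 2) / (2.0 * (rho * x1 - x2))

def estimate_force(x1, t1, x2, t2, EI):
    """Recover P by eliminating d between the two slope equations."""
    return 2.0 * EI * (x1 * t2 - x2 * t1) / (x1 * x2 ** 2 - x1 ** 2 * x2)

EI, P = 1.0e4, 0.2    # arbitrary illustrative values
x1, x2 = 5.0, 10.0    # two sensor positions, both closer to the base than d

for d in (20.0, 50.0, 500.0, 5000.0):
    t1 = tan_theta(x1, d, P, EI)
    t2 = tan_theta(x2, d, P, EI)
    print(f"d={d:7.1f}  ratio={t2 / t1:.5f}  "
          f"d_est={estimate_distance(x1, t1, x2, t2):.3f}  "
          f"P_est={estimate_force(x1, t1, x2, t2, EI):.4f}")
```

With noise-free inputs the estimates are exact, but the printed ratio drifts toward x2/x1 = 2 as d grows, so the denominator of the distance estimate tends to zero and any measurement noise is strongly amplified.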

Analysis of the dynamic model and our strategy

In the dynamic model, the basic elastic beam theory still works; however, different boundary conditions should be taken into consideration. By inserting the boundary condition listed below,

$$\left\{\begin{array}{ll}{y{\rm{|}}}_{x=0}=0\\ {{EI}{y}^{\prime\prime}{\rm{|}}}_{x=0}=-\tau \\ {{EI}{y}^{\prime\prime}{\rm{|}}}_{x=L}=0\\ {{EI}{y}^{\prime\prime\prime}{\rm{|}}}_{x=L}=0\\ {y{\rm{|}}}_{x=d}=0\\ {{EI}{y}^{\prime\prime}{\rm{|}}}_{x=d}=0\end{array}\right.$$

A set of new coefficients can be deduced:

$${C}_{1}=\frac{\tau }{d},{C}_{2}=-\tau ,{C}_{3}=\frac{\tau d}{3},{C}_{4}=0,P=\frac{\tau }{d}$$

Therefore, a modified equation is:

$${EI}{y}^{{\prime} }={EI}\frac{{dy}}{{dx}}={EI}\,\tan {\theta }^{{\prime} }=\frac{\tau }{2d}{x}^{2}-\tau x+\frac{1}{3}\tau d,x\le d$$

where τ is a torque generated by the self-rotated motor at the observation point x, and \({\theta }^{{\prime} }\) is an updated deflection angle.

In a practical model, which takes the physical volume of a fixture into consideration, we define an angular shift α formed by rotation as shown in Fig. 2C, which is a deviation of the fixture base from the initial position of the neutral axis between the object and the rotation axis. In addition, β is the angle of the observation point relative to the initial neutral axis, namely, the horizontal axis. According to the deduction of the similar triangles as shown in Fig. 2C, equation above can be reformulated as:

$${EI}\tan (\beta +\alpha )=\frac{\tau }{2d}{x}^{2}-\tau x+\frac{1}{3}\tau d$$

Due to the external angular deviation α introduced by physical volume of the fixture, a new pair of sensor data can be set as \(({x}_{1}=0,{\beta }_{1})\) and \(({x}_{2}=x,{\beta }_{2})\), then, by inserting them into the formula above and canceling the torque τ, the new formula is:

$$\frac{\tan \left({\beta }_{2}+\alpha \right)}{\tan \left({\beta }_{1}+\alpha \right)}=\frac{3{x}^{2}}{2{d}^{2}}-\frac{3x}{d}+1$$

Given that β1 is the angle at the fixture base, this externally driven angular shift can be monitored and retrieved from the digitalized actuator itself. Meanwhile, the deflection angle measured by the sensor is \({\theta }_{{sens}}={{\rm{\beta }}}_{1}-{{\rm{\beta }}}_{2}\), which means that only one sensor is required to provide adequate information to estimate the distance-coupled index.
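To make the distance-coupled index concrete, note that the ratio relation above is a quadratic in x/d. Writing r for the measured ratio tan(β2 + α)/tan(β1 + α), the root satisfying x ≤ d is x/d = 1 − √((1 + 2r)/3). This closed form is derived here for illustration and is not given in the original analysis; the sketch below verifies it with a round trip, using hypothetical function names and arbitrary values.

```python
import math

def ratio_from_distance(x, d):
    """Forward model: r = 3x^2/(2d^2) - 3x/d + 1 for a sensor at x <= d."""
    u = x / d
    return 1.5 * u ** 2 - 3.0 * u + 1.0

def distance_from_ratio(x, r):
    """Invert the forward model for d, taking the root with 0 < x/d <= 1.

    Valid ratios lie in [-0.5, 1): r = 1 corresponds to x/d -> 0 and
    r = -0.5 corresponds to x = d.
    """
    if not (-0.5 <= r < 1.0):
        raise ValueError("ratio outside the range reachable with 0 < x <= d")
    u = 1.0 - math.sqrt((1.0 + 2.0 * r) / 3.0)
    return x / u

x_sensor = 10.0  # arbitrary sensor position
for d_true in (12.0, 50.0, 200.0):
    r = ratio_from_distance(x_sensor, d_true)
    print(f"d={d_true:6.1f}  r={r:.5f}  d_est={distance_from_ratio(x_sensor, r):.3f}")
```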

In a simplified version of the practical model mentioned above, namely the ideal model shown in Supplementary Fig. S2, α is set to zero. In this case, the physical volume of the fixture is ignored and the whisker is assumed to be placed at the rotation center, which dramatically simplifies the deduction and its application in the real world. Combining this with the condition of a sensor placed at the base of the beam, which means x1 = 0, the equation can be simplified as:

$${EI}\tan \left({\rm{\beta }}+\alpha \right)={EI}\tan {\rm{\beta }}=\frac{1}{3}\tau d$$
$$d=\frac{3{EI}\tan \beta }{\tau }\approx \frac{3{EI}\beta }{\tau }({\rm{small\; deformation}})$$

Our strategy is an external-driven movement approach, which is a variant of the rotated active whisking. In this case, the equation can be re-formulated as:

$${EI}\tan (\alpha -\beta )=\frac{\tau }{2d}{x}^{2}-\tau x+\frac{1}{3}\tau d$$

The above illustrates that the distance-coupled information can be extracted from two separate observation points along the beam; however, the calculation is complicated and imposes a significant load on the CPU of the system. Here, only one sensor fixed along the whisker is used for deformation sensing, and a deep learning algorithm \({\mathcal{F}}\) with nonlinear mapping capabilities is adopted to parameterize the nonlinear relationship between the output signal and the parameters of interest, such as textural information, hardness, and distance.

$${v}_{o}={\mathcal{F}}{(d\left(x,t\right))}$$

Inclusion criteria of transduction solutions

The main advantages of PVDF-based transduction are summarized as below:

  • Low output capacitance, which makes it easy to match with the output circuit.

  • Wide frequency response range, large dynamic range, high electromechanical conversion sensitivity.

  • High mechanical strength, excellent toughness and flexibility, and robust impact resistance, which make it suitable for flat and integrated design of a system with low fracture probability.

  • Stable chemical properties, high corrosion resistance, low water absorption, no toxic substances release, which make it suitable for implantable devices and in-vivo diagnosis.

Pre-processing circuit design

A pre-processing circuit which mainly consists of a charge amplifier is designed for converting the original charge signal into a voltage signal. Supplementary Table S3 lists the parameter details of the equivalent circuit as shown in Fig. 3A, which mainly consists of a network of the capacitors and resistors.

Specifically, uo refers to the output voltage of the operational amplifier (OPA). According to Kirchhoff's current law, the current i can be calculated as:

$$i=\frac{{u}_{i}}{R}+\frac{{u}_{i}}{\frac{1}{{jwC}}}+(A+1)\frac{{u}_{i}}{{Z}_{f}}$$

where ui and A refer to the input voltage and the open-loop gain of the OPA, respectively, and w is the angular frequency. R, C, and Zf are the equivalent resistance, the equivalent capacitance, and the feedback impedance, respectively, which can be calculated as below:

$$R={R}_{{PVDF}}\parallel {R}_{i}=\frac{1}{\frac{1}{{R}_{{PVDF}}}+\frac{1}{{R}_{i}}}$$
$$C={C}_{{PVDF}}\parallel {C}_{c}\parallel {C}_{i}={C}_{{PVDF}}+{C}_{c}+{C}_{i}$$
$${Z}_{f}={R}_{f}\parallel {C}_{f}=\frac{1}{{jw}{C}_{f}+\frac{1}{{R}_{f}}}$$

Given that,

$$i=\frac{{dQ}}{{dt}}$$

the output voltage uo can be rewritten as:

$${u}_{o}=\frac{-A* {jwQ}}{\frac{1}{{R}_{{PVDF}}}+\frac{1}{{R}_{i}}+{jw}\left({C}_{{PVDF}}+{C}_{c}+{C}_{i}\right)+(A+1)({jw}{C}_{f}+\frac{1}{{R}_{f}})}$$

Given the typical ranges of the component parameters (the open-loop gain A ranges from \({10}^{4}\) to \({10}^{6}\), and both Ri and RPVDF are higher than approximately \({10}^{6}\) Ω), the following inequalities hold:

$$\frac{A+1}{{R}_{f}}\gg \frac{1}{{R}_{{PVDF}}}+\frac{1}{{R}_{i}}$$
$$(A+1){jw}{C}_{f}\gg {jw}\left({C}_{{PVDF}}+{C}_{c}+{C}_{i}\right)$$

Therefore,

$${u}_{o}\approx \frac{-A* {jwQ}}{(A+1)({jw}{C}_{f}+\frac{1}{{R}_{f}})}$$

Normally, the feedback resistor Rf provides the DC path and has a value higher than \({10}^{6}\) Ω. Together with Cf, it sets the lower cutoff frequency of the charge amplifier, which is given below:

$${f}_{L}=\frac{1}{2\pi {C}_{f}{R}_{f}}$$

For signal frequencies well above fL, the term jwCf dominates 1/Rf, and A/(A + 1) ≈ 1, so the output voltage can be calculated as:

$${u}_{o}\approx \frac{-Q}{{C}_{f}}$$
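The quality of this approximation can be checked by evaluating the full expression for uo numerically. The sketch below does so with the Python standard library; the component values are hypothetical placeholders (they are not the values of Supplementary Table S3), chosen only so that fL falls near 1.6 Hz.

```python
import math

def charge_amp_output(Q, f, A, R_pvdf, R_i, C_pvdf, C_c, C_i, R_f, C_f):
    """Full expression for u_o at frequency f (Hz) for a charge amplitude Q (C)."""
    jw = 1j * 2.0 * math.pi * f
    denom = (1.0 / R_pvdf + 1.0 / R_i
             + jw * (C_pvdf + C_c + C_i)
             + (A + 1.0) * (jw * C_f + 1.0 / R_f))
    return -A * jw * Q / denom

# Hypothetical component values, for illustration only.
params = dict(A=1e5, R_pvdf=1e9, R_i=1e9, C_pvdf=1e-9, C_c=100e-12,
              C_i=10e-12, R_f=1e8, C_f=1e-9)
Q = 1e-9  # 1 nC of generated charge

f_L = 1.0 / (2.0 * math.pi * params["C_f"] * params["R_f"])
ideal = Q / params["C_f"]  # magnitude of the approximation -Q/C_f

for f in (f_L, 10.0 * f_L, 100.0 * f_L):
    u_o = charge_amp_output(Q, f, **params)
    print(f"f = {f:8.2f} Hz   |u_o| / (Q/C_f) = {abs(u_o) / ideal:.4f}")
```

At f = fL the magnitude is about 1/√2 of Q/Cf, as expected for a first-order high-pass response, and it approaches Q/Cf once the frequency is one to two decades above fL.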

Data transmission protocol

A wireless data transmission protocol is designed and developed for capturing the sensor data remotely (and hence, potentially, for implantable applications), while RS-485/232 communication is also retained for the tethered design (i.e., for endoscopic applications). The two modes can be switched by choosing a predefined option in the software. The data frame, shown in Supplementary Fig. S12, consists of the header, number of channels, sampling rate, payload, CRC, and end indicator.

Specifically, the number of channels is set to 0x01, which is a preset value for identification purposes and for extension to a future multi-channel system. The sampling rate can be adjusted via the SPI communication protocol from 250 Hz to 2 kHz. The payload is the deformation signal of interest, with a size of 3 bytes. The Cyclic Redundancy Check (CRC) is deployed for error checking, and the generating polynomial of the CRC is given below:

$$G\left(x\right)={x}^{8}+{x}^{5}+{x}^{4}+1$$
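A minimal sketch of a CRC-8 computed with this generating polynomial is given below. The polynomial \(x^{8}+x^{5}+x^{4}+1\) corresponds to 0x31 once the leading \(x^{8}\) term is dropped. The bit order (most significant bit first), the initial value 0x00, and the absence of a final XOR are assumptions, since they are not specified in the text; the example frame body is likewise hypothetical.

```python
def crc8(data, poly=0x31, init=0x00):
    """Bitwise CRC-8, MSB first, for G(x) = x^8 + x^5 + x^4 + 1 (0x31).

    Assumed conventions: initial value 0x00, no bit reflection, no final XOR.
    """
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

# Hypothetical frame body: channel count 0x01 followed by a 3-byte payload.
body = bytes([0x01, 0x12, 0x34, 0x56])
checksum = crc8(body)
print(f"CRC-8 = 0x{checksum:02X}")

# Receiver-side check: under these conventions, the CRC over body + checksum is zero.
print("frame valid:", crc8(body + bytes([checksum])) == 0)
```

A receiver can therefore validate a frame by running the same routine over the protected bytes plus the received CRC and testing for zero; any single-bit error in the frame changes the result.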

Deep learning algorithm

Convolutional neural networks (CNNs) are a popular approach for finding the optimal features for machine learning problems, as shown in Supplementary Fig. S13A. By convolving the input signals with artificial synaptic weights, a CNN translates the original signal into high-dimensional, abstract feature vectors, which can be adopted as the basis for context awareness. In the LSTM, the input gate, forget gate, and output gate are the core components; together they determine the flow and throughput of the data through the activation functions σ(*), for instance, the sigmoid function. As shown in Supplementary Fig. S13B, the gate functions \({{\boldsymbol{g}}}_{f},{{\boldsymbol{g}}}_{i},{{\boldsymbol{g}}}_{o}\) are defined as:

$${{\boldsymbol{g}}}_{f}={Sigmoid}\left({{\boldsymbol{w}}}_{f}\left[{{\boldsymbol{h}}}_{t-1},{{\boldsymbol{x}}}_{t}\right]+{{\boldsymbol{b}}}_{f}\right)$$
$${{\boldsymbol{g}}}_{i}={Sigmoid}\left({{\boldsymbol{w}}}_{i}\left[{{\boldsymbol{h}}}_{t-1},{{\boldsymbol{x}}}_{t}\right]+{{\boldsymbol{b}}}_{i}\right)$$
$${{\boldsymbol{g}}}_{o}={Sigmoid}\left({{\boldsymbol{w}}}_{o}\left[{{\boldsymbol{h}}}_{t-1},{{\boldsymbol{x}}}_{t}\right]+{{\boldsymbol{b}}}_{o}\right)$$

where \({{\boldsymbol{h}}}_{t-1}\) and \({{\boldsymbol{x}}}_{t}\) correspond to the state vector of the hidden layers of the artificial neural network and the input data, respectively. Unlike the \({{\boldsymbol{h}}}_{t}\) vector, which is related to the output vector, the \({{\boldsymbol{c}}}_{t}\) state vector is another internal state vector that is passed to the output under the control of the gate function \({{\boldsymbol{g}}}_{o}\). The vectors \({{\boldsymbol{c}}}_{t}\) and \({{\boldsymbol{h}}}_{t}\) are defined as:

$${{\boldsymbol{c}}}_{t}={{\boldsymbol{g}}}_{i}{\widetilde{{\boldsymbol{c}}}}_{t}+{{\boldsymbol{g}}}_{f}{{\boldsymbol{c}}}_{t-1}={{\boldsymbol{g}}}_{i}\left[\tanh \left({{\boldsymbol{w}}}_{c}\left[{{\boldsymbol{h}}}_{t-1},{{\boldsymbol{x}}}_{t}\right]+{{\boldsymbol{b}}}_{c}\right)\right]+{{\boldsymbol{g}}}_{f}{{\boldsymbol{c}}}_{t-1}$$
$${{\boldsymbol{h}}}_{t}={{\boldsymbol{g}}}_{o}\tanh ({{\boldsymbol{c}}}_{t})$$

This gate control mechanism and the dual-state-vector design give the LSTM its long-term memory capability and make it less prone to the vanishing gradient problem.
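
The gate and state equations above can be sketched as a single LSTM time step in plain Python. This is an illustrative sketch only, not the network used in this work: the function names, the dictionary layout of the parameters, the layer sizes, and the weight values are all hypothetical, and the products between gates and state vectors are taken to be element-wise.

```python
import math


def sigmoid(v):
    """Element-wise logistic function."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def affine(w, z, b):
    """Compute w @ z + b, with w stored as a list of rows."""
    return [sum(wij * zj for wij, zj in zip(row, z)) + bi
            for row, bi in zip(w, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step following the g_f, g_i, g_o, c_t, h_t equations."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    g_f = sigmoid(affine(p["w_f"], z, p["b_f"]))  # forget gate
    g_i = sigmoid(affine(p["w_i"], z, p["b_i"]))  # input gate
    g_o = sigmoid(affine(p["w_o"], z, p["b_o"]))  # output gate
    c_tilde = [math.tanh(a) for a in affine(p["w_c"], z, p["b_c"])]
    # c_t = g_i * c_tilde + g_f * c_{t-1}  (element-wise)
    c_t = [gi * ct + gf * cp
           for gi, ct, gf, cp in zip(g_i, c_tilde, g_f, c_prev)]
    # h_t = g_o * tanh(c_t)  (element-wise)
    h_t = [go * math.tanh(c) for go, c in zip(g_o, c_t)]
    return h_t, c_t


# Hypothetical sizes and weights: 2 hidden units, 1 input feature.
demo_params = {
    "w_f": [[0.1, -0.2, 0.5], [0.3, 0.1, -0.4]],
    "w_i": [[0.2, 0.4, -0.1], [-0.3, 0.2, 0.6]],
    "w_o": [[0.5, -0.1, 0.2], [0.1, 0.3, -0.2]],
    "w_c": [[-0.4, 0.2, 0.7], [0.6, -0.5, 0.1]],
    "b_f": [0.0, 0.0],
    "b_i": [0.0, 0.0],
    "b_o": [0.0, 0.0],
    "b_c": [0.0, 0.0],
}

h_state, c_state = [0.0, 0.0], [0.0, 0.0]
for sample in [0.2, -0.5, 0.9]:  # hypothetical input sequence
    h_state, c_state = lstm_step([sample], h_state, c_state, demo_params)

print(h_state, c_state)
```

The additive update of `c_t` is the part that carries information across many time steps: when the forget gate stays close to 1, the previous cell state passes through almost unchanged.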