• Year - 2017
  • crying baby
      Crying is one of the main ways infants communicate with their surroundings; it signals distress and attachment needs to their caregivers. Automatic detection of a baby's cry in audio signals has many uses, from everyday applications to academic research. In this project we developed a deep-learning-based algorithm for automatic detection of baby cries in domestic audio...
  • Sitting Posture Monitoring With Kinect

    Doing computer work with incorrect posture can lead to headaches, neck pain and more. In this project, we propose a system that helps the user sit correctly at the computer. To do so, we developed a real-time system that analyzes the user's sitting position and alerts when it might be harmful.

  • Sparse Channel Estimation for Underwater Acoustic Communication

    The need for underwater wireless communications exists in many applications such as unmanned underwater vehicles, speech transmission between divers, defense and collection of data recorded at ocean-bottom stations.
    The project goal is to implement an underwater acoustic channel estimator based on a sparsity assumption.

  • Image Compression Through Multi-Scale Learned Dictionaries

    Data compression denotes the task of representing information in a compact way
    so it can be stored and transmitted efficiently. In the case of lossy compression,
    this process may discard some information so that the reconstructed data is similar
    enough to the original one, by compromising between accuracy and file size. Image
    compression has a huge importance in a world where image resolution capabilities
    of digital devices are constantly growing. Therefore, efficient and practical image
    compression algorithms are of great concern where the goal is to produce higher
    quality images with smaller file sizes.

  • Eye-Tracking by EOG Signals

    In recent years, many eye-tracking technologies have been developed. They let disabled people perform daily tasks and communicate with their surroundings, help detect conditions such as autism in babies who cannot yet communicate verbally, improve athletes' performance, and even support advertising research into which parts of an advertisement attract people's attention.

  • EEG Sensor Fusion For Epileptic Seizure Detection

    EEG sensors capture electrical waves from the human brain. One of their main uses is to record signals from epilepsy patients in order to capture their seizures. Today, seizure detection is done manually by professional physicians, who examine the EEG signals and identify when seizures occurred. There is a need for automated algorithms that identify seizures. Accurate evaluation, pre-surgery assessment, and emergency alerts for medical aid all depend on detecting the onset of seizures.

  • Red-Green Pedestrian Traffic Light Detection

    The idea for our project came from a group of visually impaired students at the Technion. They described the great difficulty blind people have in crossing pedestrian crossings safely. A possible solution is to use a smartphone to take images of the pedestrian crossing area and then apply a sophisticated image processing algorithm that indicates whether it is safe to cross.

  • Hippocampus segmentation in MRI brain scans of mice using same modality image registration

    The project deals with automatic segmentation of the hippocampus from MRI scans of mice brains. The hippocampus is a structure inside the brain that plays a role in short-term memory, spatial memory, and navigation, and is therefore of great significance. However, manual segmentation of the hippocampus can take a long time, so an algorithm that segments the hippocampus automatically is needed.

  • Predicting Dyslexia and ADHD using fMRI data

    Classification methods are usually based on a definition of distance between two points. When the data to be classified is represented as covariance matrices, using the Riemannian distance instead of the Euclidean distance produces higher success rates.
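
    As an illustration of the distance mentioned above, here is a minimal sketch. The abstract does not say which Riemannian metric the project used; the sketch assumes the affine-invariant metric on symmetric positive-definite matrices, d(A, B) = sqrt(sum_i log^2(lambda_i)), where lambda_i are the eigenvalues of A^{-1}B. The function name `riemannian_distance` is hypothetical.

```python
import numpy as np

def riemannian_distance(a, b):
    """Affine-invariant distance between two SPD matrices (assumed metric)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Whiten B by A using the Cholesky factor A = L L^T:
    # M = L^{-1} B L^{-T} is symmetric and has the eigenvalues of A^{-1} B.
    chol = np.linalg.cholesky(a)
    x = np.linalg.solve(chol, b)        # L^{-1} B
    m = np.linalg.solve(chol, x.T).T    # L^{-1} B L^{-T}
    m = 0.5 * (m + m.T)                 # remove round-off asymmetry
    eigenvalues = np.linalg.eigvalsh(m)
    return float(np.sqrt(np.sum(np.log(eigenvalues) ** 2)))

if __name__ == "__main__":
    cov_a = np.array([[2.0, 0.5], [0.5, 1.0]])
    cov_b = np.array([[1.0, 0.2], [0.2, 3.0]])
    print("Riemannian:", riemannian_distance(cov_a, cov_b))
    print("Euclidean: ", np.linalg.norm(cov_a - cov_b))
```

    Unlike the Euclidean distance, this distance does not change when both matrices are transformed by the same invertible linear map, which is one commonly cited reason for preferring it with covariance features.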

  • Real Time Control of Hand Prosthesis Using Surface EMG

    In the USA alone, there are 500,000 below-the-elbow amputees. The causes include trauma, disease, and congenital conditions. Since this condition has a significant impact on daily functioning, a hand prosthesis can help regain lost functionality. Current solutions are far from perfect, in terms of both price and range of functionality. Current hand prostheses range from mechanical prostheses, which allow only a single movement and cost several thousand dollars, to myoelectric prostheses, prosthetic hands controlled by electromyography (EMG) that allow several movement types but cost $20,000 and up.

  • Image Processing for Under Water Microscope

    This project handles image processing for the underwater microscope. The purpose of the microscope is to take pictures of plankton, small creatures that live underwater. Two main problems arose:

    The microscope takes many pictures each second. As a result, many pictures contain no plankton at all, or plankton whose size is negligible compared to the background. In addition, the setup of the microscope causes non-uniform illumination in the images.
    To make the processing easier for marine biologists, the images need to be classified according to plankton species.

  • project image 3364-2-15

    Classification methods are usually based on a definition of distance between two points. When the data to be classified is represented as covariance matrices, using the Riemannian distance instead of the Euclidean distance produces higher success rates.

  • Super-Resolution In IR Surveillance Videos

    The Telecommunication branch of the IDF approached the SIPL laboratory with a request to improve the quality of video feeds from security cameras used to locate suspicious activity. SIPL was asked to find a solution that reveals hidden information in the video, or enhances existing information, and so provides the user with new information about the security footage. The proposed solution is the use of super-resolution (SR) algorithms.


    Infra-red (IR) image colorization has always been a challenging goal. Reducing human error and speeding up reaction time are just some of the benefits of this process. However, essential differences exist between IR sensors, which are temperature dependent, and regular visible-light color sensors. These differences cause difficulties when trying to use color images in the rendering process.

  • Deciphering Nervous System Control in Stroke Patients Based on EMG Recordings

    The project used data that consisted of EMG recordings from both healthy and post-CVA individuals. The subjects were asked to extend their dominant hand towards different directions in front of them, and the goal was to find parameters that could be used to distinguish between the groups.

    The project worked under the assumption that everyday actions (such as walking or gesturing) are constructed from basic muscle activation patterns (called synergies).

    Part 2 of project 2970-1-15

  • Year - 2016
  • Audio QR Over Streaming Media

    We describe a system that delivers a website address to a cellular phone by encoding inaudible binary data in an analogue audio signal, which is received by the phone's microphone. This is an alternative to encoding a website address in a QR code label, which is scanned by the phone's camera.

  • Cellular phone

    Smartphone applications provide an opportunity to monitor several physiological vitals regularly at home for indicative and preventive measurements without possessing dedicated clinical devices. A smartphone camera can be used to estimate several vitals, including heart rate and blood pressure, using photoplethysmograms (PPG). PPG is a simple, low-cost technique that detects blood volume changes in the microvascular bed of tissue by making optical measurements at the skin surface.

    There is extensive literature on estimating systolic and diastolic blood pressure from PPG. These works perform well when applied to clean, noise-free PPG signals. However, PPG signals captured using smartphones have a very low sampling rate, suffer from ambient light, and are affected by even slight finger movement or changes in finger pressure. This can greatly degrade signal quality and make the signal too unreliable for estimating blood pressure.

  • Side Scan Sonar Image Compression

    Wireless communication between an underwater vehicle carrying a side scan sonar (SSS) and its operator is crucial for obtaining an accurate and up-to-date picture of the seabed. This has many military applications, such as underwater mine discovery, and civilian applications, such as seabed texture analysis.

    SSS images usually contain high resolution data and have high frequency content. Hence, they are not compressed well by simple compression schemes such as JPEG.
    The goal of this project is to find a compression algorithm that, on the one hand, compresses SSS images with high compression ratios and low complexity and, on the other, preserves the image features that have intelligence value.

    Several compression schemes that specialize in high resolution image compression were examined, implemented and tested during the project. The best algorithm found was compared to the JPEG 2000 standard as a reference, using subjective quality assessment tests and quality measures such as PSNR and SSIM.
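
    For reference, PSNR (one of the quality measures named above) can be computed as in the following sketch. It uses the standard definition, PSNR = 10 log10(MAX^2 / MSE); the function name `psnr` and the default 8-bit peak value of 255 are assumptions made for illustration, not details taken from the project.

```python
import numpy as np

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)   # mean squared error over all pixels
    if mse == 0:
        return float("inf")           # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
    # Simulate a lossy reconstruction by adding small noise (illustrative only).
    decoded = np.clip(original + rng.normal(0.0, 2.0, original.shape), 0, 255)
    print("PSNR [dB]:", psnr(original, decoded))
```

    A higher PSNR means the decoded image is numerically closer to the original; it does not always agree with perceived quality, which is why it is usually reported alongside SSIM and subjective tests.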

  • Hyperspectral Image Cell Segmentation & Tracking

    The project deals with segmentation and classification of fluorescent samples of lung cells captured in a hyperspectral microscope (FRET). The output of such a measurement could support the search for a thorough understanding of intercellular biological processes and malignant diseases.

  • Body Gestures DJ

    Recently, Intel released a Perceptual Computing SDK. This software kit uses a depth camera from Creative (similar to Kinect) and enables development of advanced applications that enrich user interaction. The kit provides many capabilities, e.g. face and speech recognition, tracking, and hand pose and gesture recognition.

  • Robotic Tuner for Classical Guitar

    This project addresses the task of tuning a classical guitar.

    This procedure, which every guitarist performs daily, can prove tedious for the seasoned guitarist and difficult for the novice.

    The project treats this issue as a technical task and, as such, aims to automate nearly all stages of the process.

  • Distress Situation Detection in Speech

    When a person is in distress, there are signs that are reflected in their speech or in the audio of their surroundings.

    The project deals with speaker-independent distress detection in speech of a single speaker.

  • Robust Underwater Image Compression

    We describe here a JPEG-based compression scheme, adjusted specifically for the underwater acoustic channel.
    The project goal was to deal with common bit error rates for the underwater acoustic channel.
    Two measures were used to quantify different solutions: compression ratio, which we tried to minimize, and SSIM, a measure that describes the quality of the image, which we tried to maximize.

  • Fast HEVC encoding

    In this project we deal with fast HEVC encoding algorithms suited for parallel coding on the GPU. In recent years the HEVC coding format has become popular, and it is expected to replace the currently used format, H.264. The HEVC format is expected to achieve better data compression without degrading coding quality.

  • endangered right whales

    The goal of the project is to develop an automatic algorithm for identifying and distinguishing between individual right whales, relying on aerial photographs taken over a decade. The project was suggested as a challenge by the Kaggle community, which promotes competitions in data processing and learning systems.

  • Point Cloud Registration

    Point clouds are discrete sets of points describing a hyper-surface in a certain dimension. The particular case of three dimensions refers to real objects and surfaces such as a table, a chair or a part of a landscape, where the coordinates are the familiar (X, Y, Z) spatial coordinates.

  • Accelerating regex search using GPGPU

    In this project we implemented a version of grep, the classic UNIX text searching application, that runs on GPUs and accesses the file system directly using GPUfs. The goal of the project was to compare using the GPU as a co-processor with giving the GPU control over file system access, especially in I/O-heavy applications. We wanted to compare these models in both performance and ease of use for the programmer.

  • Heart Rate Measurement From Human Voice

    The project's goal is to create a system that measures the human pulse from an audio signal (classification into high or low, and regression).
    We use the Munich Biovoice Corpus (MBC), a database of recordings labeled with pulse, along with additional recordings made by us.

  • Speaker Diarization Using Dimension Reduction

    Diarization is a well-known problem in the world of speech recognition and speech processing.

    Our project goal is speaker diarization in recorded conversations. We try a new approach to this problem, using a dimension reduction algorithm (LLE).

  • Transcoder Video Quality Assessment

    The fast development of visual media has created a need for tools that compress raw data. However, in some cases the compressed data is subject to unwanted artifacts. As a way of controlling the compression process, we wish to use an objective quality assessment metric that gives us information on the quality of the compressed data.

  • Video Quality Assessment

    Image and video quality assessment is becoming increasingly important due to the many applications of video where the end user is a human.
    Therefore, it is desirable to develop a visual quality metric that correlates well with human visual perception.
    This paper presents an automatic full-reference image quality assessment technique based on DCT Sub-bands Similarity (DSS).

  • Cry-based Detection of Developmental Disorders in Infants

    Developmental disorders are a group of neurological conditions originating in childhood that involve serious impairments in various areas (language, learning, motor skills). These conditions also include Autism Spectrum Disorders. As of 2008, approximately 15% of children in the United States had been diagnosed with some sort of developmental disorder, in comparison to only 12.8% in 1997 [1]. Early detection of developmental disorders is crucial, as it enables early intervention (e.g. speech therapy, occupational therapy), which may reduce neurological and functional deficits in infants.

  • Background Modeling in Video

    This project involves methods of background modeling in video for the purpose of segmentation between foreground (e.g. people, moving cars etc.) and background (e.g. sidewalks or roads but also more challenging cases such as vegetation and dynamic bodies of water). In numerous computer vision applications in video, a separation between the background and the foreground, which contains interest regions for a human viewer, is often required in the initial stage of processing.

  • Unsupervised Sensor Invariant Indoor Mapping and Localization

    There are many methods for mapping and localization based on sensor measurements and a known functional model. In these methods the map is created by applying the functional model to the sensor measurements. The problem is that the model is often unknown or very complicated.

  • Voice Activity Detection in Presence of Transient

    Voice activity detection has attracted significant research efforts in the last two decades. Despite much progress in designing voice activity detectors (VADs) for stationary noise, voice activity detection in the presence of transient noise remains a challenging problem.

  • Predicting the Existence of Dyslexia in Children Using fMRI

    Dyslexia is a learning disorder characterized by difficulties with accurate or fluent word recognition and by poor spelling and decoding abilities. Current diagnosis of dyslexia lacks objective criteria, which can decrease treatment efficacy. Diagnosis relies on a discrepancy between reading ability and intelligence, a measure which can be unreliable, and has been criticized for its poor validity.

  • Sub-Nyquist Ultrasound Demo System

    Ultrasound systems play an important role as advanced imaging devices in modern medical applications, as well as in many other areas. The unique properties of the transmitted acoustic waves require advanced hardware solutions in order to sample and process the large amounts of received information into a 2-dimensional image.

  • Action recognition with smart watch

    In this project we address the problem of falls: responding to and identifying the event as quickly as possible with minimal missed detections.
    We explain the existing solutions to this problem and their disadvantages, and suggest a method that uses wearable technology.
    With wearable technology, we first try to solve the problem with simple threshold comparisons and show why this method is problematic and cannot provide a proper solution.
    We then examine machine learning methods, including dimensionality reduction and neural networks, that give a good solution with high detection rates and no false detections, and explain why other methods could not provide the solution we were looking for.
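
    To make the threshold-comparison baseline concrete, here is a minimal sketch. The abstract does not specify which sensor or threshold was used; the sketch assumes a 3-axis accelerometer reporting in units of g and a hypothetical impact threshold of 2.5 g. The function name `detect_fall_threshold` is also hypothetical.

```python
import numpy as np

def detect_fall_threshold(acc_xyz, impact_g=2.5):
    """Flag a fall if the acceleration magnitude ever exceeds a fixed threshold.

    acc_xyz: array of shape (n_samples, 3), accelerometer samples in g (assumed).
    impact_g: hypothetical impact threshold in g.
    """
    magnitude = np.linalg.norm(np.asarray(acc_xyz, dtype=float), axis=1)
    return bool(np.any(magnitude > impact_g))

if __name__ == "__main__":
    resting = np.tile([0.0, 0.0, 1.0], (50, 1))   # watch at rest: about 1 g
    with_impact = resting.copy()
    with_impact[25] = [0.5, 0.5, 3.0]              # one sharp spike
    print(detect_fall_threshold(resting))
    print(detect_fall_threshold(with_impact))
```

    A rule like this reacts to any sharp wrist movement, not only to falls, which may be one reason a fixed threshold is inadequate on its own.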

  • PAPR Reduction of OFDM Signals with Channel Coding

    High peak-to-average power ratio (PAPR) is one of the major drawbacks of orthogonal frequency division multiplexing (OFDM) communication schemes. In this report, we propose a novel low complexity and low overhead technique for PAPR reduction, to be called Modulo Technique (MT). We compare our proposed technique to partial transmit sequences (PTS) and show that it achieves greater PAPR reduction while retaining similar complexity. Afterwards, we find a connection between signals randomness and their PAPR. From this connection, we develop a model for estimating PAPR from signal characteristics in the frequency domain. This model can be used for comparing different techniques, and specifically to explain why our technique is superior to PTS. We suggest how other existing PAPR reduction techniques can take advantage of this model to reduce their complexity.
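
    The quantity being reduced can be illustrated with a short sketch. It computes the PAPR of a single OFDM symbol using the standard definition (peak instantaneous power divided by mean power) and does not implement the Modulo Technique or PTS. The function names, the 64-subcarrier QPSK setup, and the absence of oversampling are assumptions for illustration.

```python
import numpy as np

def ofdm_time_signal(freq_symbols):
    """Time-domain OFDM symbol from its subcarrier symbols (plain IFFT)."""
    return np.fft.ifft(np.asarray(freq_symbols, dtype=complex))

def papr_db(signal):
    """Peak-to-average power ratio of a complex baseband signal, in dB."""
    power = np.abs(np.asarray(signal)) ** 2
    return 10.0 * np.log10(power.max() / power.mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_subcarriers = 64
    # Random QPSK symbols on every subcarrier (illustrative setup).
    bits = rng.integers(0, 2, size=(n_subcarriers, 2))
    qpsk = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)
    print("PAPR [dB]:", papr_db(ofdm_time_signal(qpsk)))
```

    Two limiting cases help with intuition: a single active subcarrier gives a constant-envelope signal (0 dB), while identical symbols on all N subcarriers add coherently into one time-domain spike, giving the maximum of 10 log10(N) dB.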

  • Rodent Bones Classification

    Many ornithological studies are based on tracking the diet of the bird species being studied. This project assists the study of the feeding patterns of raptors. Raptors feed on various rodents. The bones of the rodents are indigestible and are expelled. By identifying the rodent species from its bones, one can learn about the feeding pattern of the raptor. Classifying the rodent species requires high proficiency and is a time-consuming process.
    In this project we introduce a technique for determining the rodent species and the bone type from a picture of a bone. This technique will assist researchers as well as amateur birdwatchers (and many students required to do the classification process). The process has two steps. First, the rodent type is classified from a picture of its jaw. Second, the rest of the bones in the pellet are classified. The classification is done using machine learning methods. Taking into account the dataset of images used for training, the proposed technique achieves good results and high accuracy.

  • Block inpainting

    In recent years, video resolution has been steadily increasing. To support this trend, constant improvement in compression ratio for the same video quality is required. The HEVC standard, which was officially released in 2013, achieves about a 2x improvement over its predecessor, H.264, and many researchers are exploring different approaches to improve it even further. One of these approaches is called block-removal.

  • Music Plagiarism Detection Tool

    The project goal is to develop a tool for detecting music plagiarism by comparing melodies. First, the fundamental frequencies of the melody are extracted from two song sections, one of which is suspected of plagiarizing the melody of the other. Different algorithms for melody extraction from monophonic and polyphonic music were tested. The algorithm chosen is Durrieu's algorithm, which is based on audio signal modeling and parameter estimation.

  • Sirens Detection Algorithm in Noisy environment for the Hearing Impaired

    People with hearing disabilities experience many difficulties in everyday life that affect them and their surroundings. Technological development has helped them in many areas, but it has made driving an even harder experience. They cannot hear noises and beeping and, most importantly, they cannot hear approaching emergency vehicles.
    This inability makes them a safety hazard both to themselves and to their surroundings, because they can accidentally cause a roadblock or even an accident.

  • Year - 2015
  • Microphone array calibration using ambient signals

    Tracking applications deal with surveillance and locating objects. A robust solution to this problem requires estimating the locations of the sensors themselves. In environments without GPS, an alternative way to estimate the sensors' self-location is needed.

  • Objects Removal from Crowded Image Background

    Occasionally, while taking a photo, unwanted objects enter the frame, for example when taking pictures with smartphones or surveillance cameras.
    The project's goal is to allow a user to interactively remove objects from an image background in order to get a clean shot.
    The next part of this project would include development of an Android application that implements the current project.
    The removal process starts with taking a short video whose last frame is the user's desired photo. The next step is foreground/background segmentation, which finds the unwanted objects using different algorithms. The final step is object removal, in which the object is replaced by its background from another frame, possibly followed by image matting to improve the final result.
    During the project, 12 videos of different difficulty levels were filmed in order to test the results under different conditions.
    The results are good for easy and medium videos (static camera, not too crowded an area) and require improvement under hard conditions (trees or flags in the background, unstable camera, etc.).
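
    The final step described above (replacing the object with background taken from another frame) can be sketched as follows. This is a simplified illustration that assumes a static camera and that the foreground masks are already given by the segmentation step; the function name `remove_objects` and the strategy of searching backwards from the most recent frame are assumptions, not details from the project.

```python
import numpy as np

def remove_objects(frames, masks):
    """Fill masked pixels of the last frame with background from earlier frames.

    frames: array (T, H, W) or (T, H, W, C); the last frame is the desired photo.
    masks:  boolean array (T, H, W); True where an unwanted object is present.
    Returns the cleaned last frame and a mask of pixels that stayed unfilled.
    """
    frames = np.asarray(frames)
    masks = np.asarray(masks, dtype=bool)
    result = frames[-1].copy()
    todo = masks[-1].copy()                   # pixels still covered by an object
    # Walk backwards in time, taking each pixel from the nearest clean frame.
    for frame, mask in zip(frames[-2::-1], masks[-2::-1]):
        usable = todo & ~mask                 # covered now, clean in this frame
        result[usable] = frame[usable]
        todo &= ~usable
    return result, todo

if __name__ == "__main__":
    frames = np.full((3, 4, 4), 10, dtype=np.uint8)   # flat background
    masks = np.zeros((3, 4, 4), dtype=bool)
    frames[2, 1, 1] = 200                             # object in the last frame
    masks[2, 1, 1] = True
    cleaned, unfilled = remove_objects(frames, masks)
    print(cleaned)
    print("unfilled pixels:", int(unfilled.sum()))
```

    Pixels that are covered in every frame remain flagged in the returned mask; a crowded scene leaves more of them, which is in line with the harder cases reported above.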

  • Heart Monitor and Cardiac Arrest Detector For Wearable Devices

    Sudden Cardiac Arrest (SCA) is an unpredictable heart failure. According to the American Heart Association (AHA), 8 out of 10,000 adults in the U.S. will experience an out-of-hospital cardiac arrest (OHCA). The problem is that treatment must be received immediately, because 6-10 minutes later the person is likely to die...

  • Advanced Framework For Deep Reinforcement Learning

    This project is based on previous work by Google DeepMind, in which reinforcement learning was used to teach a computer to play games on the Atari 2600 console, which was popular in the 70s and 80s.

  • Real Time Control of Hand Prosthesis Using EMG

    Current solutions for below-the-elbow amputees include affordable prostheses that allow only a single movement and highly expensive prostheses that allow several gestures. In this project, our goal was to design a system that provides an inexpensive, multi-functional solution to the hand prosthesis problem. We construct a real-time, portable system based on the Myo armband and a 3D printed prosthesis and show that this framework can provide a good and inexpensive solution for below-the-elbow amputees of all ages.

  • Analysis of pied kingfisher foraging pattern in space

    The goal of this project is to extract the coordinates of the kingfisher's flight course from two videos acquired by GoPro cameras.

    To achieve this goal we learned and applied methods for matching two images of a scene acquired by adjacent cameras. Five experiments were conducted, in which we dealt with the differences arising from the placement and configuration of the cameras. Additionally, we time-synchronized the videos from the two cameras.

  • Speech Scrambling for SIPER

    Speech scrambling techniques transform a speech signal into an unintelligible signal in order to prevent eavesdropping. Such systems are used to guarantee end-to-end security for speech in real-time communication systems such as GSM, VoIP, telephone, analogue radio and so on.

  • Adaptive Harmonic Model for Speech Representation and Modification

    Speech signals have been a research topic for over 50 years. However, many research and engineering challenges still exist in the field of speech modeling and synthesis.

    Speech parameterization techniques that can, on the one hand, reconstruct a signal transparently and, on the other hand, modify it (in the parametric domain) are very important for flexible speech synthesis and advanced speech transformations (such as voice morphing, emotion modification etc.).

  • Saliency Detection for SIPER V

    Recently, salient object detection has attracted a lot of interest in computer vision, as it is essential for many applications such as object detection and recognition, image compression, video summarization and photo collage.

    SIPER is an educational tool developed in SIPL. It demonstrates speech, audio and image processing techniques and can also be used as an analysis tool for research purposes. SIPER allows experimentation with key parameters of each technique and shows both intermediate and final results. It is very modular and its modules can be written in C/C++.

  • Multiclass Classifiers for Brain-Computer Interface Data

    The goal of Brain-Computer Interface (BCI) systems is to give paralyzed individuals independent control of external devices using the operator's brain activity. As many BCI systems are based on electroencephalographic (EEG) signals to avoid invasive procedures, a consistent challenge is to design more robust and reliable classifiers for these signals. Although BCIs are intended for individuals who cannot move, classifiers are often calibrated on signals from healthy subjects executing movements.

  • Acoustic Echo Canceler using a Texas Instruments DSP

    The aim of this project is to implement an acoustic echo canceler that works in real-time on Texas Instruments TMS320C6748 DSP Development Kit (LCDK). This implementation may be part of a future undergraduate experiment in SIPL.

    An acoustic echo canceler (AEC) greatly enhances the audio quality of multipoint hands-free communication systems. It allows the participants in a call to speak smoothly, naturally and feel more comfortable.

  • Singing Voice Correction

    Listening to music and singing are activities most people enjoy. However, not everyone is able to sing accurately in musical terms. To let the average singer enjoy singing without inaccuracies, signal processing techniques are widely used.

  • Music Information Retrieval using Deep Learning and Modification

    MIR is used in applications that organize music databases. It characterizes the user's musical taste, and reproduces and classifies music. The most common method for this task is "Collaborative Filtering", which predicts the taste of one user using data from other users. The method used in this project is based on processing the signal itself. As of today, there is no sufficiently good MIR system.

  • Detection of Development Disorders in Infants based on Kinect Movement Analysis

    The goal of this project is to develop a technique for detecting developmental disorders in infants based on analysis of their spontaneous movements. We will use a standard depth camera (Kinect) to track the movements of specific sections of the body. The algorithm will have to be tailored to the configuration of an infant lying down and performing spontaneous movements. Then, we will apply machine learning techniques to identify infants who suffer from developmental disorders.

  • Heart Rate Estimation from PPG Signal during Physical Exercise

    We describe an algorithm for estimating heart rate from an optically measured PPG signal when physical exercises are performed.
    In this case, the PPG signal is contaminated by motion artifacts caused by hand movements, making it difficult to find its fundamental frequency that corresponds to the heart rate. To overcome the noise, a soft decision approach is taken, by which several candidates for the fundamental frequency of the PPG signal are extracted and assigned grades.
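
    The candidate-extraction idea can be sketched as follows. The abstract does not describe how candidates are found or graded, so this sketch makes its own assumptions: candidates are the strongest spectral peaks within a plausible heart-rate band (0.5 to 3.5 Hz), and each grade is the peak's share of the total candidate magnitude. The function name `heart_rate_candidates` and all parameter values are hypothetical.

```python
import numpy as np

def heart_rate_candidates(ppg, fs, k=3, band=(0.5, 3.5)):
    """Return up to k (bpm, grade) candidates from the PPG magnitude spectrum."""
    x = np.asarray(ppg, dtype=float)
    x = x - x.mean()                                   # remove the DC level
    magnitude = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    # Local maxima of the spectrum inside the assumed heart-rate band.
    is_peak = np.zeros(magnitude.shape, dtype=bool)
    is_peak[1:-1] = (magnitude[1:-1] > magnitude[:-2]) & (magnitude[1:-1] >= magnitude[2:])
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peaks = np.flatnonzero(is_peak & in_band)
    if peaks.size == 0:
        return []
    strongest = peaks[np.argsort(magnitude[peaks])[::-1][:k]]
    grades = magnitude[strongest] / magnitude[strongest].sum()
    return [(60.0 * freqs[i], float(g)) for i, g in zip(strongest, grades)]

if __name__ == "__main__":
    fs = 25.0
    t = np.arange(500) / fs
    # Synthetic signal: a 2 Hz "pulse" plus a weaker 1.2 Hz "motion" component.
    ppg = np.sin(2 * np.pi * 2.0 * t) + 0.5 * np.sin(2 * np.pi * 1.2 * t)
    for bpm, grade in heart_rate_candidates(ppg, fs, k=2):
        print(f"{bpm:.1f} bpm, grade {grade:.2f}")
```

    Keeping several graded candidates rather than a single peak allows a later stage to choose among them, for example by using the previous estimate, when a motion artifact temporarily dominates the spectrum.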

  • Geometric Modeling of EEG Signals in Alzheimer Patients

    The project's goal is to classify test subjects into two groups: control and patients based on their EEG signals. The secondary goal is to determine the severity of the disease among the patients.
    The methods in use in this project are Manifold Learning and specifically Diffusion Maps. These methods were adapted to the problem, given the EEG data, and implemented in the Time Domain, Frequency Domain and using Scattering Transform.

  • Intonation Correction in Audiobooks for the Blind
    In this project we try to find a way to fix intonation problems in Hebrew speech files, and make them sound “better”. This solution is meant to fix Text-to-Speech sound files, which today have a lot of intonation problems. Audio books are often used by blind / visually impaired people, who cannot read text. Recording a whole book is hard and requires good reading skills. Therefore audiobooks are read today...
  • Text Detection in Images using Deep Learning

    Text detection in natural scene images is an important preprocessing step for many content-based image analysis tasks. Deep learning is a set of brain-inspired algorithms that involve deep multi-layered neural networks. These neural networks are trained to find a set of features that represent the input data, thus enabling machine learning and computer vision uses such as classification and detection. This approach is the state of the art in computer vision, voice recognition and natural language processing, and is used by Google, Microsoft, Yahoo and more.

  • Implementation of the ‘engine’ of Multi-touch table

    The goal of the project is to repair and improve the processing of an image displayed on a touch screen, in order to enable shape recognition. The aspect of image processing dealt with in the project is the combining and stitching of three images, taken by three different cameras, into a single image that describes the surface of the touch screen.

  • Computer Aided Graphology

    The project’s goal is to develop a system whose input is a person’s handwriting and whose output is a personality evaluation according to a graphological analysis.
    Graphology is the analysis of handwriting. It helps identify and evaluate the writer, indicating their psychological state at the time of writing. The basic assumption of graphology is that writing is a reflex and therefore reflects a person’s personality traits. Graphology has a part in many profiling applications. One example is choosing people for jury duty in the U.S. judicial system.

  • Word Classification in Children Speech using Scattering Transform

    This project was conducted within the framework of a Magneton with the company LinguisTech. In this project we explored different speech recognition methods and tested them on speech recordings of children. In particular, we examined the benefits of using the Scattering transform as a feature extraction method with several well-known classification algorithms such as GMM and SVM. We compared the performance of the Scattering transform features to that of MFCC features, which are known in the literature as efficient audio descriptors. The Scattering coefficients generalize the MFCC coefficients and are characterized by their stability to numerous signal deformations, which allows successful classification, as was demonstrated in other tasks such as image texture and musical genre classification.

  • Fast High Efficiency Video Coding (HEVC)

    High Efficiency Video Coding (HEVC) is a new video coding standard that has recently been finalized. Due to its substantially improved performance, it is expected to replace the H.264 video coding standard and to become the most common video coding technique within a few years. A major innovation in HEVC is the use of a quad-tree based coding tree block for images.

  • Epson Moverio BT-200 Smart Glasses

    Augmented Reality is a computing technology that uses a copy of reality, in which virtual elements are combined with the real environment in real time. The technology is implemented by having the user look through a semi-transparent medium onto which virtual information is projected.
    The project's main goal is to calibrate the glasses' camera with the user’s eyes in order to align their points of view.

  • Year - 2014
  • Simulation of a Rotary Speaker

    This project deals with creating a computerized simulation of a rotating speaker, known as a Leslie speaker, so that the same effect can be created without the physical speaker.

  • Low rate underwater video transmission

    The goal of the project is to find an algorithm for underwater video compression that enables video transmission at good quality under the low channel capacity of the underwater acoustic channel. The algorithm exploits the redundancies of underwater video, such as slow motion, blurred background, and strong correlation between the color channels.

  • Fast QuadTree Partitioning for High Efficiency Video Coding (HEVC)

    A new generation video coding standard, named High Efficiency Video Coding (HEVC), has been developed by JCT-VC. This new standard provides a significant improvement in picture quality, especially for high resolution videos.
    HEVC adopts a QuadTree (QT) based Coding Unit (CU) block partitioning structure, which adapts flexibly to the various texture characteristics of images. The QT is created for the encoding and decoding processes, and the Rate Distortion (RD) cost is calculated for all possible dimensions of CUs in the QT.
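
    The exhaustive search described above can be illustrated with a minimal sketch. It is not the HEVC reference encoder: the cost function is a toy stand-in (squared error around the block mean plus a fixed per-block overhead) for the usual Lagrangian cost J = D + λR, and the names `partition` and `make_toy_rd_cost` are hypothetical.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square coding unit


def make_toy_rd_cost(image: List[List[float]], overhead: float) -> Callable[[int, int, int], float]:
    """Toy RD cost (an assumption, not the HEVC cost): distortion of coding
    the block as one flat value, plus a fixed rate overhead per block."""
    def rd_cost(x: int, y: int, size: int) -> float:
        pixels = [image[r][c] for r in range(y, y + size) for c in range(x, x + size)]
        mean = sum(pixels) / len(pixels)
        distortion = sum((p - mean) ** 2 for p in pixels)
        return distortion + overhead
    return rd_cost


def partition(x: int, y: int, size: int, min_size: int,
              rd_cost: Callable[[int, int, int], float]) -> Tuple[float, List[Block]]:
    """Return (best cost, leaf blocks) for the square block at (x, y).

    Every block is evaluated both as a single CU and as four recursively
    partitioned sub-blocks; the cheaper alternative is kept.
    """
    whole_cost = rd_cost(x, y, size)
    if size <= min_size:
        return whole_cost, [(x, y, size)]

    half = size // 2
    split_cost = 0.0
    split_leaves: List[Block] = []
    for dy in (0, half):
        for dx in (0, half):
            cost, leaves = partition(x + dx, y + dy, half, min_size, rd_cost)
            split_cost += cost
            split_leaves += leaves

    if split_cost < whole_cost:
        return split_cost, split_leaves
    return whole_cost, [(x, y, size)]


if __name__ == "__main__":
    # A 4x4 image with a vertical edge: splitting into 2x2 blocks is cheaper.
    edge = [[0, 0, 100, 100] for _ in range(4)]
    print(partition(0, 0, 4, 2, make_toy_rd_cost(edge, overhead=10.0)))
```

    Because every block size is evaluated at every position, the number of cost evaluations grows quickly with the tree depth, which is the motivation for fast partitioning methods.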

  • Tone Mapping of SWIR Images

    Sensing in the Shortwave Infrared (SWIR) range has only recently been made practical. The SWIR
    band has an important advantage: it is not visible to the human eye, but since it is reflective,
    its imagery shows shadows and contrast. Moreover, SWIR sensors are highly tolerant to
    challenging atmospheric conditions such as fog and smoke. They can be made extremely sensitive,

  • Wind Guitar

    The goal of our project is to produce music from harmonic signals supplied by a system we built in the first part of the project. This system acts as an anemometer, recognizing different wind directions as well as different wind velocities. One of the project's requirements is that the system be easy to use and that the sounds it produces be pleasant to the ear.

  • Matching of Consecutive Video Frames

    Motion estimation of objects between consecutive video frames poses a unique challenge here, because standard motion estimation techniques are difficult to apply given the lack of distinctive features and the large difference between the frames. In this project we developed a system which, given two consecutive video frames, matches large objects between them and estimates the translation transformation from one frame to the other, while dealing with occlusions.

  • 3D Full Object Reconstruction from Kinect

    The main goal of the first part of the project was to perform an Iterative Closest Point registration on two depth maps obtained using the Kinect depth sensor, in C++ on the Windows platform. The other purpose of this first part was to learn how to integrate large libraries independently

  • Audio Matching in a Mobile Environment

    Audience measurement is important to many entities in the communications and advertising world. The ability to measure audiences efficiently and reliably has great potential for those entities.
    The idea is to sample data from mobile phones in order to measure audiences, based on the fact that nearly everyone has a smartphone nowadays.

  • Eye-Blinking Detection in EEG Signals

    EEG (Electro-Encephalography) is the recording of electrical activity along the scalp. The EEG signals are recorded by 64 electrodes along the scalp, and are the basis of any BCI (Brain Computer Interface) system. BCI technology has the potential to revolutionize the way people communicate.

  • Convolution and Normalized Cross Correlation on Kepler Architecture

    Since the introduction of the Intel 4004 (the first commercial microprocessor), and even with today's multicore chips, there has always been a need for computers with greater processing power. For 30 years this was achieved by increasing CPU clock rates; however, because of numerous physical limitations in the fabrication process of integrated circuits, it became uneconomical to continue this trend. Only recently has the computing market shifted towards parallel software and hardware design in search of performance increases. Modern graphics processing units (GPUs) are specialized circuits initially designed for the computer gaming market,

  • Augmented Reality Pinball

    Created in an undergraduate project in the Signal and Image Processing Laboratory (SIPL), Department of Electrical Engineering, Technion – Israel Institute of Technology.
    We have built a standalone, robust and portable platform that enables an augmented reality pinball game based on virtual and real objects and using common hardware.

  • Low Complexity Image Compression of Capsule Endoscopy Images

    Capsule endoscopy is a method for recording images of the digestive tract. A patient swallows a capsule containing a tiny camera, which captures images that are then transmitted wirelessly to an external receiver for examination by a physician. Due to the limited computational capabilities of the capsule and the bandwidth constraints that derive from its dimensions, low-complexity and efficient compression of the images is required before transmission.

  • Video Quality Assessment Prototype System

    Video quality assessment is becoming increasingly important. It is therefore desirable to develop a visual quality metric that correlates well with human visual perception.

  • Compressed Sensing Based Interior Tomography

    The CT (Computerized Tomography) scan enables estimation of the interior of a scanned object but involves exposure to high amounts of radiation. In some cases it is desirable to reconstruct only a local region of interest (ROI) with fewer measurements and as a result, less radiation.

  • Android to MIDI Pitch

    In this project we investigate and implement body movement tracking for sound modulation. The idea began when one of the students wanted to be able to bend pitch, as guitar players do, while playing the piano. The basic idea is to track the movement of a motion sensor (in this project, through the Android sensors API) and modulate the sound played by the musician according to the sensor data. The first challenge was to connect all the devices to the computer; in this project the backbone of the connection is UDP over Wi-Fi.
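
    As a rough illustration of the sensor-to-sound step, the sketch below converts one sensor reading into a standard 3-byte MIDI pitch-bend message (status byte 0xE0 plus channel, followed by the 7-bit LSB and MSB of a 14-bit value centered at 8192). The packet layout (a single little-endian 32-bit float) and the normalization of the tilt to [-1, 1] are assumptions made for this example, not the project's actual protocol, and both function names are hypothetical.

```python
import struct


def parse_tilt_packet(payload: bytes) -> float:
    """Decode a hypothetical UDP payload holding one little-endian float32 tilt."""
    (tilt,) = struct.unpack("<f", payload[:4])
    return tilt


def tilt_to_pitch_bend(tilt: float, channel: int = 0) -> bytes:
    """Map a tilt in [-1, 1] (assumed normalization) to a MIDI pitch-bend message."""
    tilt = max(-1.0, min(1.0, tilt))
    # 14-bit bend value: 0 = full bend down, 8192 = no bend, 16383 = full bend up.
    value = min(16383, max(0, round(8192 + tilt * 8192)))
    lsb = value & 0x7F          # low 7 bits
    msb = (value >> 7) & 0x7F   # high 7 bits
    return bytes([0xE0 | (channel & 0x0F), lsb, msb])


if __name__ == "__main__":
    packet = struct.pack("<f", 0.5)  # stand-in for a datagram received over Wi-Fi
    print(tilt_to_pitch_bend(parse_tilt_packet(packet)).hex())
```

    In a complete system the payload would come from a UDP socket and the resulting bytes would be written to a MIDI output port; both are omitted here.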

  • Lips Region Detection for Visual Speech Processing

    The performance of traditional speech algorithms, which are based on the audio signal, deteriorates in highly non-stationary acoustic environments. Visual information is immune to that type of interference and is helpful for speech perception. A variety of speech algorithms that are based on the visual signal assume that a bounding box of the lips' location is known. Therefore, the performance of these algorithms depends on accurate detection of the lips' bounding box. In this project we present two algorithms for accurate detection of the lips' bounding box.

  • Saliency Detection

    Detecting and recognizing the main point of interest, or salient point, in an image has become a common need due to the large number of tracking and adaptive image compression applications. Every application oriented to open space needs to acquire the relevant target to focus on. Saliency detection is the field that deals with detecting and recognizing the main point of interest in a picture. This task, which is very basic for the human brain and eye, is very complex for a machine.

  • Automatic improvement of singing voice

    In contemporary Western music, songs are commonly played on the semi-tone scale. This means that the notes of the song can be chosen only from a specific set of frequencies, called the semi-tone scale. For example, the set of notes of a piano or a guitar is finite, and singing or playing "between" the notes is considered out of tune. Therefore, singing out of tune can be quantified by the amount of deviation from this scale.
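
    One common way to quantify this deviation, assuming twelve-tone equal temperament with A4 tuned to 440 Hz (an assumption, not stated in the project description), is to measure the distance in cents from the nearest semitone, where 100 cents equal one semitone. The function name below is hypothetical and the sketch is only illustrative.

```python
import math


def semitone_deviation_cents(freq_hz: float, a4_hz: float = 440.0) -> float:
    """Signed deviation of freq_hz from the nearest equal-tempered semitone.

    The result lies in [-50, 50] cents; 0 means the pitch is exactly on the scale.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Distance from A4 in (fractional) semitones.
    semitones = 12.0 * math.log2(freq_hz / a4_hz)
    nearest = round(semitones)
    return 100.0 * (semitones - nearest)


if __name__ == "__main__":
    for f in (440.0, 446.0, 452.9):
        print(f, round(semitone_deviation_cents(f), 1))
```

    A correction algorithm could then shift each sung note by the negative of this deviation, although how the project itself performs the correction is not described here.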

  • Movie Subtitle Extraction

    As some movies, and specifically opera videos, contain embedded subtitles without accompanying text files, there is a need for a robust system capable of extracting these subtitles from the movie into readable text. Extraction of this information involves detection, localization, tracking, enhancement, and recognition of the text from a given image.

  • Motion Analysis Using Kinect for Monitoring Parkinson Disease

    Parkinson's Disease (PD) is a degenerative disease of the central nervous system with a profound effect on the motor system. Symptoms include slowness of movement, rigidity of motion and in some patients, tremor.
    The severity of the disease is quantified using the Unified Parkinson Disease Rating Scale (UPDRS) which is a subjective scale performed and scored by physicians.

    In this work, we present an automated, objective quantitative analysis of four UPDRS motor examinations of Hand Movement and Finger Taps.

  • Fish Analysis in Video

    In recent years, we can see an increasing use of image processing systems in various business sectors, such as agriculture, where quality testing and monitoring processes are still performed manually. Because of low equipment prices and advances in the field of image processing, many traditional areas are leaning towards automation solutions.
    In this project we attempt to characterize fish behavior in pools during the day and to build a system that raises an alarm in case of atypical fish behavior. Such behavior may indicate a change in the fish's living conditions or a disease.

  • Year - 2013
  • Video Compression for Underwater Acoustic Communication

    Today, the vast majority of online video systems are wired, enabling high bit rate communications at the cost of range and mobility limitations. Wireless underwater acoustic modems have been developed in the past few years. Using orthogonal frequency division multiplexing (OFDM), rates of up to tens of kilobits per second have been reached.

  • Shadow Detection in Aerial Images Using Side Information

    The subject of this project is shadow detection in aerial images using details of the flight: the time, the position and the plane’s direction when the photograph was taken. Intelligence interpretation of images is concerned with temporal differences between images. These differences include unwanted shadow differences, since the images are captured at different times, locations and flight directions.

  • Stuttering Detection for Android

    Dysfluency and stuttering are breaks or interruptions of normal speech, such as repetition, prolongation or interjection of syllables, sounds, words or phrases, and involuntary silent pauses or blocks in communication.

    The goal of this project is to build an algorithm that detects stuttering with high reliability and as few false alarms as possible.

  • Sub-Nyquist Methods in 3D Ultrasound Imaging

    Contemporary sonography is performed by digitally beamforming signals sampled by several transducer elements placed upon an array. High-resolution digital beamforming introduces the demand for a sampling rate significantly higher than the signal's Nyquist rate, which greatly increases the volume of data that must be processed. In 3D ultrasound imaging, the amount of sampled data is vastly increased with respect to 2D imaging.

  • Sub-Nyquist Ultrasound Demo System

    Ultrasound systems play an important role as advanced imaging devices in modern medical applications, as well as in many other areas. The unique properties of the transmitted acoustic waves require advanced hardware solutions in order to sample and process the large amounts of received information into a 2-dimensional image.

  • Sound Texture Synthesis

    Synthesis of sound textures, such as rain, wind or crowd applause, is an important and useful application in many domains such as the movie industry, multimedia, video games, music and more.

    The need for a flexible and efficient synthesis method that can produce natural sound is obvious, yet, to this day, no automatic synthesis method is in common use.
