Overflows in urban drainage structures, or sewers, must be prevented on time to avoid their undesirable consequences. An effective monitoring system able to measure volumetric flow in sewers is needed. Existing stateof-the-art technologies are not robust against harsh sewer conditions and, therefore, cause high maintenance expenses. Having the goal of fully automatic, robust and non-contact volumetric flow measurement in sewers, we came up with an original and innovative idea of a vision-based system for volumetric flow monitoring. On the contrast to existing video-based monitoring systems, we introduce a second camera to the setup and exploit stereo-vision aiming of automatic calibration to the real world. Depth of the flow is estimated as the difference between distances from the camera to the water surface and from the camera to the canal’s bottom. Camerato-water distance is recovered automatically using large-scale stereo matching, while the distance to the canal’s bottom is measured once upon installation. Surface velocity is calculated using cross-correlation template matching. Individual natural particles in the flow are detected and tracked throughout the sequence of images recorded over a fixed time interval. Having the water level and the surface velocity estimated and knowing the geometry of the canal we calculate the discharge. The preliminary evaluation has shown that the average error of depth computation was 3 cm, while the average error of surface velocity resulted in 5 cm/s. Due to the experimental design, these errors are rough estimates. At each acquisition session the reference depth value was measured only once, although the variation in volumetric flow and the gradual transitions between the automatically detected values indicated that the actual depth level has varied. We will address this issue in the next experimental session.