Why Every IP Camera App Sucks (And How We Fixed It)
25 days ago

The Context
The native camera app we were stuck with was genuinely awful. It took forever to load, lagged on touch interactions, had zero customization for layouts, and no support for features like picture-in-picture or proper background operation. Every interaction felt unnecessarily frustrating—like wading through layers of UI designed by someone who never actually used their own app.
We needed a camera control UI that wasn't just usable—it had to feel effortless. Something clean, fast, responsive, and tailored to exactly how we wanted to interact with our cameras, especially on mobile. We wanted proper multi-camera views, intuitive gestures, reliable background streaming, and meaningful integration with broadcast tools like OBS Studio.
The Problem
Rather than hacking together a weekend fix, we approached this methodically, tackling each problem step by step:
- Slow, Unresponsive Interface: The original app took forever to load and every touch interaction felt laggy
- Zero Layout Flexibility: Forced into rigid layouts with no customization—no clean grid views, no picture-in-picture
- Poor Background Support: Streams would freeze when the app went to the background, requiring manual reloads
The Solution
Future-Proof Foundation with React 19
Logic: We needed a foundation that would load quickly and feel responsive. React 19's improved performance and better rendering model were crucial for handling the frequent state updates required in real-time camera control.
Implementation: Built a composition-based architecture with reusable custom hooks:
    // Custom hooks for complex camera operations
    const {
      selectedCamera,
      moveCamera,
      startContinuousMove,
      stopContinuousMove,
      gotoPreset,
    } = useAppContext();
    const { handlePressStart, handlePressEnd, swipeHandlers } = useMovementControls();
Nuance: The React Compiler (a separate, opt-in build tool that works with React 19) handles most memoization automatically, but we still memoize by hand with useMemo and useCallback for expensive operations like gesture recognition calculations.
Canvas-Based Video Streaming Architecture
Logic: We needed smooth real-time video with zoom and pan operations without performance hits.
Implementation: Canvas-based MJPEG streaming player with real-time transformations:
    // Custom canvas-based player with pan/zoom support
    const CanvasStreamPlayer = forwardRef<CanvasStreamPlayerRef, Props>(
      ({ src, onError, onRefresh, onCameraOverlay, isOverlayOpen }, ref) => {
        const [transform, setTransform] = useState({ scale: 1, x: 0, y: 0 });

        // Gesture handling with @use-gesture/react
        useGesture({
          onPinch: ({ offset: [scale], origin: [ox, oy] }) => {
            // Complex pinch-to-zoom logic with origin tracking
          },
          onDrag: ({ movement: [mx, my] }) => {
            // Pan handling with boundary constraints
          },
          onWheel: ({ event, delta }) => {
            // Mouse wheel zoom with cursor-based origin
          },
        });
      }
    );
Nuance: The main complexity came from ensuring Picture-in-Picture works properly—copying the zoomed/panned view from the main screen to the PiP window so they stay synchronized.
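The pinch, drag, and wheel handlers above leave the actual math as comments. Here is a minimal sketch of one way to do it, not the code from our player: it assumes the frame fills the canvas at scale 1 and that x and y are offsets from the canvas center. The names zoomAt and clampPan, and the 1x to 8x zoom limits, are hypothetical.

```typescript
// Hypothetical shapes; the transform matches useState({ scale: 1, x: 0, y: 0 }).
interface Transform { scale: number; x: number; y: number }
interface Size { width: number; height: number }
interface Point { x: number; y: number }

// Zoom to nextScale while keeping the image point under `origin`
// (canvas coordinates) fixed on screen. The scale limits are assumptions.
function zoomAt(
  t: Transform,
  nextScale: number,
  origin: Point,
  canvas: Size,
  minScale = 1,
  maxScale = 8,
): Transform {
  const scale = Math.min(maxScale, Math.max(minScale, nextScale));
  const k = scale / t.scale;
  // Origin measured from the canvas center, where the image is anchored.
  const dx = origin.x - canvas.width / 2;
  const dy = origin.y - canvas.height / 2;
  return { scale, x: dx - k * (dx - t.x), y: dy - k * (dy - t.y) };
}

// Keep the scaled image covering the canvas so no empty border shows.
// Assumes the image is exactly canvas-sized at scale 1.
function clampPan(t: Transform, canvas: Size): Transform {
  const maxX = (canvas.width * (t.scale - 1)) / 2;
  const maxY = (canvas.height * (t.scale - 1)) / 2;
  return {
    scale: t.scale,
    x: Math.min(maxX, Math.max(-maxX, t.x)),
    y: Math.min(maxY, Math.max(-maxY, t.y)),
  };
}
```

On an 800x600 canvas, zooming from 1x to 2x around the point (600, 300) gives an offset of (-200, 0), which keeps that point under the cursor. Running every pinch, wheel, and drag result through clampPan before storing it is one way to enforce the boundary constraints.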
Progressive Web App for Easy Access
Logic: The native app lacked proper background support—streams would freeze and feeds had to be manually reloaded. We built this as a PWA for easy access and to ensure background refresh and clock functionality worked properly.
Implementation: Service worker with background sync capabilities:
    // Background sync for maintaining stream health
    export const usePWABackgroundSync = () => {
      const executeBackgroundTasks = useCallback(async () => {
        const tasks = [];

        // Refresh camera list
        tasks.push(loadCameraList());

        // Restart stream if needed
        if (!isStreaming) {
          tasks.push(startStreamAndSetURL());
        }

        await Promise.allSettled(tasks);
      }, [loadCameraList, isStreaming, startStreamAndSetURL]);
    };
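Promise.allSettled never rejects, so a failed camera refresh or stream restart passes silently unless the results are inspected. The sketch below shows one way to surface those failures. It is an illustration rather than part of the hook above: summarizeSettled, runBackgroundTasks, and the TaskSummary shape are hypothetical names.

```typescript
// Hypothetical summary shape for a batch of background tasks.
interface TaskSummary {
  succeeded: number;
  failures: string[];
}

// Turn the raw allSettled results into counts plus readable failure messages.
// `names` is expected to line up index-for-index with `results`.
function summarizeSettled(
  names: readonly string[],
  results: readonly PromiseSettledResult<unknown>[],
): TaskSummary {
  const summary: TaskSummary = { succeeded: 0, failures: [] };
  results.forEach((result, i) => {
    if (result.status === "fulfilled") {
      summary.succeeded += 1;
      return;
    }
    const reason =
      result.reason instanceof Error ? result.reason.message : String(result.reason);
    summary.failures.push(`${names[i] ?? `task ${i}`}: ${reason}`);
  });
  return summary;
}

// Run named tasks in parallel; one failure does not cancel the others.
async function runBackgroundTasks(
  tasks: Record<string, () => Promise<unknown>>,
): Promise<TaskSummary> {
  const entries = Object.entries(tasks);
  const results = await Promise.allSettled(entries.map(([, run]) => run()));
  return summarizeSettled(entries.map(([name]) => name), results);
}
```

A caller could pass something like { cameras: loadCameraList, stream: startStreamAndSetURL } and log summary.failures, or schedule a retry when the list is not empty.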
Dual-Mode Gesture Recognition
Logic: We needed quick navigation between presets, intuitive camera switching, and direct camera control—all through gestures. Mixing these interactions would lead to accidental movements, so we needed clear mode separation.
Implementation: Mode-aware gesture system with timing thresholds:
    const swipeHandlers = useSwipeable({
      onSwipeStart: (eventData) => {
        if (cameraMode === "move") {
          // react-swipeable reports dir as "Left" | "Right" | "Up" | "Down";
          // lowercasing assumes the move API takes "left", "right", and so on.
          const direction = eventData.dir.toLowerCase();

          // Start continuous movement timer
          longSwipeTimeoutRef.current = setTimeout(async () => {
            await startContinuousMove(cameraId, direction);
          }, 500);
        }
      },
      onSwipedLeft: () => {
        if (cameraMode === "normal") {
          navigatePreset("next");
        } else {
          handleCameraMove("left");
        }
      },
    });
Nuance: The 500ms threshold distinguishes quick movements from continuous panning—shorter times led to accidental continuous movements, longer times made discrete movements feel sluggish.
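The mode and timing rules can also be written as a single pure function, which makes the 500ms boundary easy to unit test. This is a sketch under assumptions, not our handler code: resolveSwipe and the GestureAction shape are hypothetical, the right swipe mapping to the previous preset is assumed by symmetry, and vertical swipes in normal mode are left unhandled.

```typescript
type CameraMode = "normal" | "move";
// Direction strings as reported by react-swipeable.
type SwipeDir = "Left" | "Right" | "Up" | "Down";
type MoveDirection = "left" | "right" | "up" | "down";

// Hypothetical result type describing what a swipe should trigger.
type GestureAction =
  | { kind: "preset"; step: "next" | "previous" }
  | { kind: "stepMove"; direction: MoveDirection }
  | { kind: "continuousMove"; direction: MoveDirection }
  | { kind: "none" };

const LONG_SWIPE_MS = 500;

// Decide what a swipe means from the current mode and how long it was held.
function resolveSwipe(
  mode: CameraMode,
  dir: SwipeDir,
  heldMs: number,
  thresholdMs: number = LONG_SWIPE_MS,
): GestureAction {
  if (mode === "normal") {
    if (dir === "Left") return { kind: "preset", step: "next" };
    // Assumed by symmetry with the left swipe.
    if (dir === "Right") return { kind: "preset", step: "previous" };
    return { kind: "none" };
  }
  const direction = dir.toLowerCase() as MoveDirection;
  // At or past the threshold the swipe counts as continuous panning.
  return heldMs >= thresholdMs
    ? { kind: "continuousMove", direction }
    : { kind: "stepMove", direction };
}
```

Because the function takes the held duration as input instead of reading a timer, thresholds other than 500ms can be tried in tests without touching the gesture wiring.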
Picture-in-Picture with Transform Preservation
Logic: The original app lacked proper PiP support. We wanted PiP that maintained camera zoom and pan states consistently for multitasking scenarios.
Implementation: Real-time canvas streaming that preserves transform state:
    const togglePiP = async () => {
      const canvas = canvasRef.current;
      const ctx = canvas?.getContext("2d");
      if (!canvas || !ctx) return;

      // Create real-time canvas stream
      const stream = canvas.captureStream(30);
      const video = document.createElement("video");
      video.srcObject = stream;
      video.muted = true;
      await video.play();
      await video.requestPictureInPicture();

      // Stop redrawing once the PiP window is closed
      let active = true;
      video.addEventListener(
        "leavepictureinpicture",
        () => {
          active = false;
        },
        { once: true }
      );

      // Maintain transform state in PiP
      const loop = () => {
        if (!active) return;
        const transform = transformRef.current;

        // Apply current zoom/pan to PiP window
        ctx.save();
        ctx.translate(
          canvas.width / 2 + transform.x,
          canvas.height / 2 + transform.y
        );
        ctx.scale(transform.scale, transform.scale);
        ctx.drawImage(sourceImage, -width / 2, -height / 2, width, height);
        ctx.restore();
        requestAnimationFrame(loop);
      };
      loop();
    };
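To check that the PiP window shows the same region as the main view, it helps to know where the translate, scale, and drawImage sequence above puts the frame. The sketch below reproduces that arithmetic as a pure function. projectImage and the Rect shape are hypothetical names, and the transform is assumed to be the same { scale, x, y } object used by the player.

```typescript
interface Transform { scale: number; x: number; y: number }
interface Size { width: number; height: number }
interface Rect { x: number; y: number; width: number; height: number }

// Rectangle (in canvas pixels) covered by the image after translating to
// the canvas center plus the pan offset, scaling, and drawing it centered.
function projectImage(t: Transform, canvas: Size, image: Size): Rect {
  const width = image.width * t.scale;
  const height = image.height * t.scale;
  return {
    x: canvas.width / 2 + t.x - width / 2,
    y: canvas.height / 2 + t.y - height / 2,
    width,
    height,
  };
}
```

For an 800x600 frame on an 800x600 canvas at 2x zoom with a pan of (-200, 0), the frame lands at (-600, -300) with size 1600x1200. Two views stay in sync as long as they compute this rectangle from the same transform.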
The Impact
The result was a camera control app that felt responsive, customizable, and tailored to real-world usage. By prioritizing clean design, adding gesture-based interactions, and making PWA capabilities work reliably, we turned the frustrating experience of the old app into something that felt good to use.
- Immediate and responsive interactions: We did away with slow-loading frustrations
- Customizable camera views: Including proper picture-in-picture, with none of the rigidity of native layouts
- Robust background capabilities: Made the app dependable and usable in daily scenarios
What we built wasn't overengineered; it was what should have existed in the first place. Typical React projects rarely have to combine real-time streaming, background sync, pixel-perfect responsiveness, and gesture-based interactions, so it was rewarding to build something we use daily.