Why Every IP Camera App Sucks (And How We Fixed It)

25 days ago

Technologies
React.js, PWAs

The Context

The native camera app we were stuck with was genuinely awful. It took forever to load, lagged on touch interactions, had zero customization for layouts, and no support for features like picture-in-picture or proper background operation. Every interaction felt unnecessarily frustrating—like wading through layers of UI designed by someone who never actually used their own app.

We needed a camera control UI that wasn't just usable—it had to feel effortless. Something clean, fast, responsive, and tailored to exactly how we wanted to interact with our cameras, especially on mobile. We wanted proper multi-camera views, intuitive gestures, reliable background streaming, and meaningful integration with broadcast tools like OBS Studio.

The Problem

Rather than hacking together a weekend fix, we approached this methodically, tackling each problem step by step:

  1. Slow, Unresponsive Interface: The original app took forever to load and every touch interaction felt laggy
  2. Zero Layout Flexibility: Forced into rigid layouts with no customization—no clean grid views, no picture-in-picture
  3. Poor Background Support: Streams would freeze when the app went to background, requiring manual reloads

The Solution

Future-Proof Foundation with React 19

Logic: We needed a foundation that would load quickly and feel responsive. React 19's improved performance and better rendering model were crucial for handling the frequent state updates required in real-time camera control.

Implementation: Built a composition-based architecture with reusable custom hooks:

// Custom hooks for complex camera operations
const {
  selectedCamera,
  moveCamera,
  startContinuousMove,
  stopContinuousMove,
  gotoPreset,
} = useAppContext();

const { handlePressStart, handlePressEnd, swipeHandlers } =
  useMovementControls();

Nuance: The React Compiler (an opt-in build-time tool that pairs with React 19) handles most memoization automatically, but we still memoize expensive operations by hand, such as gesture recognition calculations.

Canvas-Based Video Streaming Architecture

Logic: We needed smooth real-time video with zoom and pan operations without performance hits.

Implementation: Canvas-based MJPEG streaming player with real-time transformations:

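// Custom canvas-based player with pan/zoom support
import { forwardRef, useRef, useState } from "react";
import { useGesture } from "@use-gesture/react";

const CanvasStreamPlayer = forwardRef<CanvasStreamPlayerRef, Props>(
  ({ src, onError, onRefresh, onCameraOverlay, isOverlayOpen }, ref) => {
    const canvasRef = useRef<HTMLCanvasElement>(null);
    const [transform, setTransform] = useState({ scale: 1, x: 0, y: 0 });

    // Gesture handling with @use-gesture/react, attached to the canvas
    useGesture(
      {
        onPinch: ({ offset: [scale], origin: [ox, oy] }) => {
          // Complex pinch-to-zoom logic with origin tracking
        },
        onDrag: ({ movement: [mx, my] }) => {
          // Pan handling with boundary constraints
        },
        onWheel: ({ event, delta }) => {
          // Mouse wheel zoom with cursor-based origin
        },
      },
      { target: canvasRef }
    );

    // Frame drawing and the imperative `ref` API are omitted from this excerpt
    return <canvas ref={canvasRef} />;
  }
);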

Nuance: The main complexity came from ensuring Picture-in-Picture works properly—copying the zoomed/panned view from the main screen to the PiP window so they stay synchronized.
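
The pinch and wheel handlers above are only outlined, so here is a rough illustration of the math such handlers typically need. This is a minimal sketch, not the app's actual code. It assumes the same convention as the PiP snippet later in this post (pan offsets measured from the canvas center), that the video fills the canvas at scale 1, and a 1x to 8x zoom range. The names `zoomAt` and `clampPan` are hypothetical.

```typescript
interface Transform {
  scale: number;
  x: number; // pan offset from the canvas center, in canvas pixels
  y: number;
}

const clamp = (value: number, min: number, max: number): number =>
  Math.min(max, Math.max(min, value));

// Zoom to nextScale while keeping the content under (originX, originY) fixed.
// The origin is measured from the canvas center, like the pan offsets.
function zoomAt(
  current: Transform,
  originX: number,
  originY: number,
  nextScale: number,
  minScale = 1, // assumed limits, not taken from the original app
  maxScale = 8
): Transform {
  const scale = clamp(nextScale, minScale, maxScale);
  const ratio = scale / current.scale;
  return {
    scale,
    x: originX - (originX - current.x) * ratio,
    y: originY - (originY - current.y) * ratio,
  };
}

// Keep the scaled video covering the whole view so no empty border appears.
function clampPan(t: Transform, viewWidth: number, viewHeight: number): Transform {
  const maxX = Math.max(0, ((t.scale - 1) * viewWidth) / 2);
  const maxY = Math.max(0, ((t.scale - 1) * viewHeight) / 2);
  return {
    scale: t.scale,
    x: clamp(t.x, -maxX, maxX),
    y: clamp(t.y, -maxY, maxY),
  };
}
```

A wheel handler could call `zoomAt` with the cursor position and pass the result through `clampPan` before storing it, so a zoom near the edge never exposes empty canvas.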

Progressive Web App for Easy Access

Logic: The native app lacked proper background support—streams would freeze and feeds had to be manually reloaded. We built this as a PWA for easy access and to ensure background refresh and clock functionality worked properly.

Implementation: Service worker with background sync capabilities:

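// Background sync for maintaining stream health
import { useCallback } from "react";

export const usePWABackgroundSync = () => {
  // Assumed to come from the app context, like the other camera actions
  const { loadCameraList, isStreaming, startStreamAndSetURL } = useAppContext();

  const executeBackgroundTasks = useCallback(async () => {
    const tasks: Promise<unknown>[] = [];

    // Refresh camera list
    tasks.push(loadCameraList());

    // Restart stream if needed
    if (!isStreaming) {
      tasks.push(startStreamAndSetURL());
    }

    // allSettled lets every task finish even if one of them fails
    await Promise.allSettled(tasks);
  }, [loadCameraList, isStreaming, startStreamAndSetURL]);

  return { executeBackgroundTasks };
};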
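
The hook above covers what to refresh; when to run it is not shown. One common trigger is the page becoming visible again after a stay in the background. The sketch below illustrates that idea under stated assumptions: `runOnResume` and `VisibilitySource` are hypothetical names, and the source is typed as a small interface (rather than `document` directly) so the logic can be exercised outside a browser.

```typescript
// Minimal slice of the Document API that this sketch relies on.
interface VisibilitySource {
  visibilityState: string;
  addEventListener(type: "visibilitychange", listener: () => void): void;
  removeEventListener(type: "visibilitychange", listener: () => void): void;
}

// Runs `task` every time the page becomes visible again.
// Returns a function that detaches the listener.
function runOnResume(source: VisibilitySource, task: () => void): () => void {
  const listener = (): void => {
    if (source.visibilityState === "visible") {
      task();
    }
  };
  source.addEventListener("visibilitychange", listener);
  return () => source.removeEventListener("visibilitychange", listener);
}
```

In the browser, `document` could be passed as the source and a function such as `executeBackgroundTasks` as the task, with the returned cleanup called from a `useEffect` teardown.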

Dual-Mode Gesture Recognition

Logic: We needed quick navigation between presets, intuitive camera switching, and direct camera control—all through gestures. Mixing these interactions would lead to accidental movements, so we needed clear mode separation.

Implementation: Mode-aware gesture system with timing thresholds:

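const swipeHandlers = useSwipeable({
  onSwipeStart: (eventData) => {
    if (cameraMode === "move") {
      // react-swipeable reports dir as "Left" | "Right" | "Up" | "Down";
      // lowercasing assumes startContinuousMove takes the same values as handleCameraMove
      const direction = eventData.dir.toLowerCase();
      // Start continuous movement if the swipe is still held after 500 ms
      longSwipeTimeoutRef.current = setTimeout(async () => {
        await startContinuousMove(cameraId, direction);
      }, 500);
    }
  },
  onSwiped: () => {
    // Swipe finished: cancel the timer if it is still pending
    if (longSwipeTimeoutRef.current) {
      clearTimeout(longSwipeTimeoutRef.current);
    }
  },
  onSwipedLeft: () => {
    if (cameraMode === "normal") {
      navigatePreset("next");
    } else {
      handleCameraMove("left");
    }
  },
});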

Nuance: The 500ms threshold distinguishes quick movements from continuous panning—shorter times led to accidental continuous movements, longer times made discrete movements feel sluggish.
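
To make the threshold logic concrete, here is a framework-free sketch of the same idea: a release before the threshold counts as one discrete step, while a hold past it switches to continuous movement until release. It is an illustration under assumptions, not the app's code. `createHoldController`, `MoveActions`, and `Scheduler` are hypothetical names, and the timer is injectable only so the behavior can be checked without waiting for real time to pass.

```typescript
type Direction = "left" | "right" | "up" | "down";

interface MoveActions {
  step(direction: Direction): void; // one discrete nudge
  startContinuous(direction: Direction): void;
  stopContinuous(): void;
}

interface Scheduler {
  set(callback: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const realTimers: Scheduler = {
  set: (callback, ms) => setTimeout(callback, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

function createHoldController(
  actions: MoveActions,
  thresholdMs = 500, // the value the post settled on
  timers: Scheduler = realTimers
) {
  let pending: { direction: Direction; handle: unknown } | null = null;
  let continuous = false;

  // Called when a press or swipe begins.
  function start(direction: Direction): void {
    if (pending) timers.clear(pending.handle);
    const handle = timers.set(() => {
      // Threshold reached while still held: switch to continuous movement.
      pending = null;
      continuous = true;
      actions.startContinuous(direction);
    }, thresholdMs);
    pending = { direction, handle };
  }

  // Called when the press or swipe ends.
  function end(): void {
    if (pending) {
      // Released before the threshold: treat it as a discrete step.
      timers.clear(pending.handle);
      actions.step(pending.direction);
      pending = null;
    } else if (continuous) {
      actions.stopContinuous();
      continuous = false;
    }
  }

  return { start, end };
}
```

Keeping the timing rule in one small controller makes the threshold easy to tune, which matters when the right value is found by trial and error as described above.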

Picture-in-Picture with Transform Preservation

Logic: The original app lacked proper PiP support. We wanted PiP that maintained camera zoom and pan states consistently for multitasking scenarios.

Implementation: Real-time canvas streaming that preserves transform state:

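const togglePiP = async () => {
  const canvas = canvasRef.current;
  const ctx = canvas?.getContext("2d");
  if (!canvas || !ctx) return;

  // Toggle off if a PiP window is already open
  if (document.pictureInPictureElement) {
    await document.exitPictureInPicture();
    return;
  }

  // Create real-time canvas stream (capped at 30 fps)
  const stream = canvas.captureStream(30);
  const video = document.createElement("video");
  video.srcObject = stream;
  video.muted = true;
  await video.play();
  await video.requestPictureInPicture();

  // Stop redrawing and release the stream once the PiP window closes
  let active = true;
  video.addEventListener(
    "leavepictureinpicture",
    () => {
      active = false;
      stream.getTracks().forEach((track) => track.stop());
    },
    { once: true }
  );

  // Maintain transform state in PiP.
  // sourceImage, width and height (the current frame and its draw size)
  // are assumed to be defined elsewhere in the component.
  const loop = () => {
    if (!active) return;
    const transform = transformRef.current;

    // Apply current zoom/pan to PiP window
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    ctx.save();
    ctx.translate(
      canvas.width / 2 + transform.x,
      canvas.height / 2 + transform.y
    );
    ctx.scale(transform.scale, transform.scale);
    ctx.drawImage(sourceImage, -width / 2, -height / 2, width, height);
    ctx.restore();

    requestAnimationFrame(loop);
  };
  loop();
};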
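
Canvas-driven PiP depends on three browser features being present: `document.pictureInPictureEnabled`, `HTMLVideoElement.requestPictureInPicture`, and `HTMLCanvasElement.captureStream`. The post does not say how the app handles browsers that lack them, so the following is only a sketch of a guard one might put in front of the PiP button. `canUseCanvasPiP` and the three `...Like` interfaces are hypothetical names.

```typescript
// Structural slices of the DOM types, so the check also runs outside a browser.
interface PiPDocumentLike {
  pictureInPictureEnabled?: boolean;
}
interface PiPVideoLike {
  requestPictureInPicture?: unknown;
}
interface PiPCanvasLike {
  captureStream?: unknown;
}

// True only when every API the canvas-to-PiP approach relies on is available.
function canUseCanvasPiP(
  doc: PiPDocumentLike,
  video: PiPVideoLike,
  canvas: PiPCanvasLike
): boolean {
  return (
    doc.pictureInPictureEnabled === true &&
    typeof video.requestPictureInPicture === "function" &&
    typeof canvas.captureStream === "function"
  );
}
```

In the component this could be called with `document`, a freshly created video element, and the canvas from `canvasRef`, hiding the PiP control when it returns `false`.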

The Impact

The result was a camera control app that felt responsive, customizable, and tailored to real-world usage. By prioritizing clean design, adding gesture-based interactions, and making PWA capabilities work reliably, we turned the frustrating experience of the old app into something that felt good to use.

  • Immediate and responsive interactions: We did away with slow-loading frustrations
  • Customizable camera views: Including proper picture-in-picture, with none of the rigidity of native layouts
  • Robust background capabilities: Made the app dependable and usable in daily scenarios

What we built wasn't overengineered; it was what should have existed in the first place. Typical React projects rarely have to combine real-time streaming, background sync, pixel-perfect responsiveness, and gesture-based interactions, so it was rewarding to build something I use daily.