Building a scalable file picker with React and Firebase

Just a guy who loves to write code and watch anime.
Introduction
I want to share my experience at my previous startup where we built a file and folder picker.
This will focus on React and performance.
We also used Firebase, so that's an interesting part as well.
The initial approach
Initially, we went with a recursive approach.
type FileSystemNode = {
  id: string
  name: string
  type: 'file' | 'folder'
  isSelected?: boolean
  children?: FileSystemNode[]
}
This would be one full object stored in a single document in Firebase. Mind you, we're dealing with thousands of files and folders.
This is the natural instinct. Folders have files, so you simply store a folder's contents in children.
A file, of course, has no children.
Before we talk about the problem we had with Firebase, let's talk about the problem we had with React.
Given a folder or the root, we show all files and folders. You can select a file or folder. You can also navigate into a folder and select files or folders inside it.
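To make the navigation part concrete, here's a minimal sketch of what "navigate into a folder" means with the nested shape: finding a folder by id requires walking the tree. The findNode helper and the sample data are my own illustration, not code from our app.

```typescript
type FileSystemNode = {
  id: string;
  name: string;
  type: "file" | "folder";
  isSelected?: boolean;
  children?: FileSystemNode[];
};

// Hypothetical sample data, only for illustration
const sampleTree: FileSystemNode = {
  id: "root",
  name: "Root",
  type: "folder",
  children: [
    {
      id: "docs",
      name: "Documents",
      type: "folder",
      children: [{ id: "cv", name: "cv.pdf", type: "file" }],
    },
    { id: "notes", name: "notes.txt", type: "file" },
  ],
};

// Depth-first search: in the worst case this visits every node in the tree
const findNode = (
  node: FileSystemNode,
  id: string
): FileSystemNode | undefined => {
  if (node.id === id) return node;
  for (const child of node.children ?? []) {
    const found = findNode(child, id);
    if (found) return found;
  }
  return undefined;
};

// "Navigating" into the Documents folder and listing what it contains
const docsFolder = findNode(sampleTree, "docs");
const docsChildNames = docsFolder?.children?.map((child) => child.name) ?? [];
```

With thousands of nodes, that lookup is one more cost of keeping everything in a single nested object.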
Let's look at some pseudo code:
import { memo, useCallback, useState } from "react";

const FileBrowser = () => {
  const [fileSystem, setFileSystem] = useState<FileSystemNode>({
    /*...*/
  });
  // Hooks must be called at the top level of the component,
  // so useCallback wraps the handler directly
  const toggleSelect = useCallback((id: string) => {
    // ...
  }, []);
  const navigate = useCallback((id: string) => {
    // ... switch the folder being displayed
  }, []);
  return (
    <FolderContents
      node={fileSystem}
      onSelect={toggleSelect}
      onNavigate={navigate}
    />
  );
};

const FolderContents = memo(({ node, onSelect, onNavigate }) => {
  // Separate folders and files
  const folders =
    node.children?.filter((child) => child.type === "folder") || [];
  const files = node.children?.filter((child) => child.type === "file") || [];
  return (
    <div>
      {/* Show folders first */}
      {folders.map((folder) => (
        <FolderItem
          key={folder.id}
          node={folder}
          onSelect={onSelect}
          onNavigate={onNavigate} // Different behavior for folders
        />
      ))}
      {/* Then files */}
      {files.map((file) => (
        <FileItem key={file.id} node={file} onSelect={onSelect} />
      ))}
    </div>
  );
});
Let's look at some more code:
This would be the function that updates the selection of a node.
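const updateNodeSelection = (
  node: FileSystemNode,
  targetId: string
): FileSystemNode => {
  if (node.id === targetId) {
    // Found the node to update: return a copy with the flag flipped
    return {
      ...node,
      isSelected: !node.isSelected,
    };
  }
  const { children } = node;
  if (!children) {
    return node;
  }
  // Need to recursively check children
  const nextChildren = children.map((child) =>
    updateNodeSelection(child, targetId)
  );
  // Only nodes on the path to the target get a new object;
  // everything else keeps its old reference
  const changed = nextChildren.some((child, i) => child !== children[i]);
  return changed ? { ...node, children: nextChildren } : node;
};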
This is what toggleSelect would look like:
const toggleSelect = useCallback((id: string) => {
  setFileSystem((prev) => updateNodeSelection(prev, id));
}, []);
The problem here is that every node from the target all the way up to the root gets a new reference. Therefore, we'd be creating unnecessary re-renders for items that don't need to be re-rendered. The reason is that state updates in React need to be immutable: you create new objects instead of modifying what's in place. Modifying in place is something you'd do with useRef in React.
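Here's a small sketch that checks which references change after a toggle. It uses a variant of the update function that hands back the original object when nothing below it changed. The sample tree and the variable names are made up for illustration.

```typescript
type FileSystemNode = {
  id: string;
  name: string;
  type: "file" | "folder";
  isSelected?: boolean;
  children?: FileSystemNode[];
};

const updateNodeSelection = (
  node: FileSystemNode,
  targetId: string
): FileSystemNode => {
  if (node.id === targetId) {
    return { ...node, isSelected: !node.isSelected };
  }
  const { children } = node;
  if (!children) return node;
  const nextChildren = children.map((child) =>
    updateNodeSelection(child, targetId)
  );
  // Keep the old object when no child was replaced
  const changed = nextChildren.some((child, i) => child !== children[i]);
  return changed ? { ...node, children: nextChildren } : node;
};

// Hypothetical tree: root -> (A -> a1, a2), (B -> b1)
const treeBefore: FileSystemNode = {
  id: "root",
  name: "Root",
  type: "folder",
  children: [
    {
      id: "a",
      name: "A",
      type: "folder",
      children: [
        { id: "a1", name: "a1.txt", type: "file" },
        { id: "a2", name: "a2.txt", type: "file" },
      ],
    },
    {
      id: "b",
      name: "B",
      type: "folder",
      children: [{ id: "b1", name: "b1.txt", type: "file" }],
    },
  ],
};

const treeAfter = updateNodeSelection(treeBefore, "a1");
const [folderABefore, folderBBefore] = treeBefore.children ?? [];
const [folderAAfter, folderBAfter] = treeAfter.children ?? [];

const rootChanged = treeAfter !== treeBefore; // on the path to a1
const folderAChanged = folderAAfter !== folderABefore; // on the path to a1
const folderBChanged = folderBAfter !== folderBBefore; // off the path
```

Selecting a1 replaces a1, folder A, and the root. Folder B and a2 keep their old references.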
What about memoization?
You might ask, why not memoize the FolderItem and FileItem components?
Well, let's look at that!
const FolderItem = memo(({ node, onSelect, onNavigate }) => {
  return (
    <div className="flex items-center gap-2 p-2 hover:bg-gray-100">
      <input
        type="checkbox"
        checked={node.isSelected}
        onChange={() => onSelect(node.id)}
        onClick={(e) => e.stopPropagation()} // Prevent folder navigation
      />
      <div
        onClick={() => onNavigate(node.id)}
        className="flex items-center gap-1 cursor-pointer"
      >
        📁 {node.name}
      </div>
    </div>
  );
});
const FileItem = memo(({ node, onSelect }) => {
  return (
    <div className="flex items-center gap-2 p-2 hover:bg-gray-100">
      <input
        type="checkbox"
        checked={node.isSelected}
        onChange={() => onSelect(node.id)}
      />
      <div className="flex items-center gap-1">📄 {node.name}</div>
    </div>
  );
});
Memoization only gets us part of the way. Every node from the target all the way up to the root still gets a new reference, so we can't prevent re-renders for those. For the other ones, e.g. nodes that aren't on that path, we could prevent re-renders because their references stay the same (we'd need to use the comparison callback memo provides, or keep the other props such as onSelect stable).
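To see why references matter to memo, here's a simplified stand-in for its default check, which compares each prop of the previous and next render with Object.is. The propsAreEqual function and the sample props are illustrative assumptions. React's real implementation is more general, but the idea is the same.

```typescript
type TreeNode = { id: string; isSelected?: boolean };
type ItemProps = { node: TreeNode; onSelect: (id: string) => void };

// Simplified stand-in for memo's default shallow comparison:
// the render is skipped only if every prop is identical by reference
const propsAreEqual = (prev: ItemProps, next: ItemProps): boolean =>
  Object.is(prev.node, next.node) && Object.is(prev.onSelect, next.onSelect);

// A handler with a stable identity, like one wrapped in useCallback
const stableOnSelect = (_id: string): void => {};

const untouchedNode: TreeNode = { id: "b1" };
const targetNode: TreeNode = { id: "a1" };
const toggledTarget: TreeNode = { ...targetNode, isSelected: true };

// Same node reference and same handler: the re-render can be skipped
const untouchedSkips = propsAreEqual(
  { node: untouchedNode, onSelect: stableOnSelect },
  { node: untouchedNode, onSelect: stableOnSelect }
);

// The toggled node is a new object: the item re-renders
const targetSkips = propsAreEqual(
  { node: targetNode, onSelect: stableOnSelect },
  { node: toggledTarget, onSelect: stableOnSelect }
);

// A fresh inline handler on every render defeats the check too
const inlineHandlerSkips = propsAreEqual(
  { node: untouchedNode, onSelect: (_id: string): void => {} },
  { node: untouchedNode, onSelect: (_id: string): void => {} }
);
```

This is also why the select handler needs a stable identity: a new function on every render makes the comparison fail even when the node didn't change.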
Second approach (more optimal but not perfect yet)
The second approach would be to use a flat structure.
It would look something like this:
type FlatNode = {
  id: string;
  parentId: string | null;
  type: "file" | "folder";
  name: string;
  isSelected: boolean;
};
const [nodes, setNodes] = useState<Record<string, FlatNode>>({
  root: { id: "root", parentId: null, type: "folder", name: "Root", isSelected: false },
  file1: { id: "file1", parentId: "root", type: "file", name: "File 1", isSelected: false },
});
We'd update the state like this:
setNodes((prev) => ({
  ...prev,
  file1: { ...prev.file1, isSelected: true },
}));
The nice part here is that only the target node gets a new reference.
All the other nodes keep the same reference. To be clear, in the flat structure they're all siblings: entries of the same record, linked only by parentId.
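Here's a minimal sketch of the flat update outside of React, so the reference behavior is easy to check. The toggle and childrenOf helpers and the sample data are assumptions for illustration. In the component, the same logic would live inside the setNodes updater.

```typescript
type FlatNode = {
  id: string;
  parentId: string | null;
  type: "file" | "folder";
  name: string;
  isSelected: boolean;
};
type NodeMap = Record<string, FlatNode>;

// Replace only the target entry; every other entry is copied by reference
const toggle = (nodes: NodeMap, id: string): NodeMap => {
  const node = nodes[id];
  if (!node) return nodes; // unknown id: nothing to update
  return { ...nodes, [id]: { ...node, isSelected: !node.isSelected } };
};

// The contents of a folder are the nodes that point at it via parentId
const childrenOf = (nodes: NodeMap, parentId: string): FlatNode[] =>
  Object.values(nodes).filter((node) => node.parentId === parentId);

// Hypothetical sample data
const initialNodes: NodeMap = {
  root: { id: "root", parentId: null, type: "folder", name: "Root", isSelected: false },
  docs: { id: "docs", parentId: "root", type: "folder", name: "Documents", isSelected: false },
  file1: { id: "file1", parentId: "docs", type: "file", name: "File 1", isSelected: false },
  file2: { id: "file2", parentId: "docs", type: "file", name: "File 2", isSelected: false },
};

const updatedNodes = toggle(initialNodes, "file1");
const docsContents = childrenOf(updatedNodes, "docs").map((node) => node.id);
```

Compared to the nested version, the parent folder (docs) keeps its reference even though one of its files changed.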
The file item would look different, however. I'm only going to show the file item here; the folder one follows a similar approach:
const FileItem = memo(
  ({ nodeId, nodes, onSelect }) => {
    const node = nodes[nodeId];
    return (
      <div className="flex items-center gap-2 p-2 hover:bg-gray-100">
        <input
          type="checkbox"
          checked={node.isSelected}
          onChange={() => onSelect(node.id)}
        />
        <div className="flex items-center gap-1">📄 {node.name}</div>
      </div>
    );
  },
  (prevProps, nextProps) =>
    prevProps.nodes[prevProps.nodeId] === nextProps.nodes[nextProps.nodeId]
);
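Here, we compare the previous and next node by reference before deciding whether to re-render. This comparison is a callback you can pass as the second argument to memo: returning true tells React to skip the re-render. We need it because the nodes prop itself is a new object after every update, so the default comparison would re-render every item.
So we're saying in human words: "If the previous node has the same reference as the next node, then don't re-render". If this is not true, then we re-render.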
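The comparison callback is a plain function, so it can be sketched and checked on its own. The name sameNode and the sample props are assumptions for illustration.

```typescript
type FlatNode = {
  id: string;
  parentId: string | null;
  type: "file" | "folder";
  name: string;
  isSelected: boolean;
};
type FileItemProps = {
  nodeId: string;
  nodes: Record<string, FlatNode>;
  onSelect: (id: string) => void;
};

// Same shape as the callback passed to memo: true means "skip the re-render"
const sameNode = (prev: FileItemProps, next: FileItemProps): boolean =>
  prev.nodes[prev.nodeId] === next.nodes[next.nodeId];

const fileOne: FlatNode = { id: "file1", parentId: "root", type: "file", name: "File 1", isSelected: false };
const fileTwo: FlatNode = { id: "file2", parentId: "root", type: "file", name: "File 2", isSelected: false };

const prevNodes = { file1: fileOne, file2: fileTwo };
// Selecting file1 replaces only that entry
const nextNodes = { ...prevNodes, file1: { ...fileOne, isSelected: true } };

const noop = (_id: string): void => {};

const fileOneSkips = sameNode(
  { nodeId: "file1", nodes: prevNodes, onSelect: noop },
  { nodeId: "file1", nodes: nextNodes, onSelect: noop }
);
const fileTwoSkips = sameNode(
  { nodeId: "file2", nodes: prevNodes, onSelect: noop },
  { nodeId: "file2", nodes: nextNodes, onSelect: noop }
);
```

One caveat with a comparison like this: it ignores onSelect, so the component keeps whatever handler it last rendered with. That's only safe if the handler has a stable identity, for example via useCallback.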
This is a step in the right direction, but it's not perfect yet.
I mentioned we used Firebase.
The max size for a document in Firebase is 1 MiB, which is 1,048,576 bytes.
We're dealing with thousands of files and folders (more, to be honest). At most, roughly 10,000 to 12,500 objects would fit in a single document.
This is why we needed to find a better solution.
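As a back-of-the-envelope check, here's what those numbers imply per object. This is plain arithmetic on the figures above, not a measurement: how Firebase actually counts bytes for field names and values follows its own rules, so treat the results as rough budgets.

```typescript
const MAX_DOCUMENT_BYTES = 1_048_576; // 1 MiB

// Average byte budget per node if `nodeCount` nodes share one document
const bytesPerNodeAt = (nodeCount: number): number =>
  Math.floor(MAX_DOCUMENT_BYTES / nodeCount);

// How many nodes fit if each one takes `bytesPerNode` bytes on average
const maxNodesAt = (bytesPerNode: number): number =>
  Math.floor(MAX_DOCUMENT_BYTES / bytesPerNode);

const budgetAt10k = bytesPerNodeAt(10_000); // 104 bytes per node
const budgetAt12_5k = bytesPerNodeAt(12_500); // 83 bytes per node
const nodesAt100Bytes = maxNodesAt(100); // 10485 nodes
```

An id, a name, a type, and a flag use up a budget of around 100 bytes quickly, and the document has no room left to grow.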
Third approach (database related)
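The better solution is to have subcollections for each entity type (folders and files), where every folder and file is its own small document, so no single document comes near the size limit.
This would scale to thousands and hundreds of thousands of objects.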
// Folders collection
{
  'root': {
    id: 'root',
    name: 'Root',
    parentId: null // null indicates this is root
  },
  'folder1': {
    id: 'folder1',
    name: 'Documents',
    parentId: 'root'
  }
}
// Files collection
{
  'file1': {
    id: 'file1',
    name: 'File 1',
    parentId: 'folder1'
  }
}
Now, if you find yourself hitting the limits of your current stack (in our case, Firebase), think outside the box and see how you can solve the problem within it. This matters especially if you're deep in and can't suddenly change your stack.
PS. How we'd query the data:
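import { getDocs, query, where } from "firebase/firestore";

// filesCollection and foldersCollection are assumed to be
// collection references created elsewhere
// Get contents of current folder
const getCurrentFolderContents = async (folderId: string) => {
  const filesQuery = query(filesCollection, where("parentId", "==", folderId));
  const foldersQuery = query(
    foldersCollection,
    where("parentId", "==", folderId)
  );
  // Run both queries in parallel
  const [files, folders] = await Promise.all([
    getDocs(filesQuery),
    getDocs(foldersQuery),
  ]);
  // Return plain objects instead of discarding the snapshots
  return {
    files: files.docs.map((doc) => ({ ...doc.data(), id: doc.id })),
    folders: folders.docs.map((doc) => ({ ...doc.data(), id: doc.id })),
  };
};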
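To connect this back to the flat React state, here's a minimal sketch of merging one folder's query results into the node map. The StoredDoc type and the mergeFolderContents helper are assumptions for illustration. They take plain objects rather than Firebase snapshots.

```typescript
type StoredDoc = { id: string; name: string; parentId: string | null };
type FlatNode = StoredDoc & { type: "file" | "folder"; isSelected: boolean };

// Add freshly loaded folders and files to the nodes we already hold
const mergeFolderContents = (
  previous: Record<string, FlatNode>,
  folders: StoredDoc[],
  files: StoredDoc[]
): Record<string, FlatNode> => {
  const result: Record<string, FlatNode> = { ...previous };
  const add = (doc: StoredDoc, type: FlatNode["type"]): void => {
    // Keep nodes we've already loaded as-is: their reference and selection
    // state survive, so memoized items don't re-render
    // (a sketch-level simplification: renames on the server are ignored)
    if (previous[doc.id]) return;
    result[doc.id] = { ...doc, type, isSelected: false };
  };
  for (const folder of folders) add(folder, "folder");
  for (const file of files) add(file, "file");
  return result;
};

// Hypothetical query results for the root folder, then for folder1
const afterRoot = mergeFolderContents(
  {},
  [{ id: "folder1", name: "Documents", parentId: "root" }],
  []
);
const afterFolder1 = mergeFolderContents(
  afterRoot,
  [],
  [{ id: "file1", name: "File 1", parentId: "folder1" }]
);
```

This way the picker only loads the folders the user actually opens, and the state keeps the flat shape from the second approach.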






